
Add chunk size parameter to EmbeddingsRequest

sibbl opened this issue 1 year ago • 1 comment

Feature Request

Azure OpenAI only allows a single string per embeddings request. Other frameworks offer a `chunk_size` or `embed_batch_size` parameter to work around this.

Describe the solution you'd like

I'd propose an `int? ChunkSize = null` parameter on `EmbeddingsRequest`. If it's > 0, the library should split the input into batches and make multiple requests, with at most that many inputs per request.
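A minimal sketch of the batching logic this could use, assuming the library handles the splitting client-side before issuing requests. The class and method names here are hypothetical illustrations, not part of the actual OpenAI-DotNet API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmbeddingChunking
{
    // Split the inputs into batches of at most chunkSize strings each.
    // A chunkSize of null or <= 0 means "no chunking": one batch with all inputs.
    public static IEnumerable<List<string>> Chunk(IReadOnlyList<string> inputs, int? chunkSize)
    {
        if (chunkSize is null || chunkSize <= 0)
        {
            yield return inputs.ToList();
            yield break;
        }

        for (var i = 0; i < inputs.Count; i += chunkSize.Value)
        {
            yield return inputs.Skip(i).Take(chunkSize.Value).ToList();
        }
    }
}
```

For the Azure limitation quoted below, `ChunkSize = 1` would produce one request per string; the library would then issue one embeddings call per batch and merge the results.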

Describe alternatives you've considered

I implemented the chunking myself, but since other frameworks have this built in, it might be worth adding such a parameter here as well.

Additional context

Quote from MS docs about this limitation:

I am trying to use embeddings and received the error "InvalidRequestError: Too many inputs. The max number of inputs is 1." How do I fix this? This error typically occurs when you try to send a batch of text to embed in a single API request as an array. Currently Azure OpenAI does not support batching with embedding requests. Embeddings API calls should consist of a single string input per request. The string can be up to 8191 tokens in length when using the text-embedding-ada-002 (Version 2) model.

sibbl avatar Jun 05 '23 13:06 sibbl
