llama_index
Azure OpenAI: Token indices sequence length is longer than the specified maximum sequence length for this model
```python
# Imports assumed for the llama_index / langchain versions current at the time of this issue
import openai
from langchain.llms import AzureOpenAI
from langchain.embeddings import OpenAIEmbeddings
from llama_index import GPTSimpleVectorIndex, LLMPredictor, LangchainEmbedding, SimpleDirectoryReader

llm = AzureOpenAI(
    deployment_name="gpt-35-turbo",
    model_kwargs={
        "api_key": openai.api_key,
        "api_base": openai.api_base,
        "api_type": openai.api_type,
        "api_version": openai.api_version,
    },
)
llm_predictor = LLMPredictor(llm=llm)

embedding_llm = LangchainEmbedding(OpenAIEmbeddings())

documents = SimpleDirectoryReader('/dbfs/FileStore/shared_uploads').load_data()
index = GPTSimpleVectorIndex(documents)
```
Error: Token indices sequence length is longer than the specified maximum sequence length for this model (3481 > 1024). Running this sequence through the model will result in indexing errors
Then I get:
INFO:openai:error_code=None error_message='Too many inputs for model None. The max number of inputs is 1. We hope to increase the number of inputs per request soon. Please contact us through an Azure support request at: https://go.microsoft.com/fwlink/?linkid=2213926 for further questions.' error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False
Looks like langchain doesn't expose the context size for AzureOpenAI super well yet.
We can look into a quick fix on our side first!
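In the meantime, one workaround is to declare the context window explicitly through a `PromptHelper` instead of relying on what LangChain reports. This is only a sketch against the older llama_index API (pre-`ServiceContext`, where the index constructor accepts `prompt_helper` directly), and the numbers below are illustrative rather than values confirmed in this thread:

```python
from llama_index import GPTSimpleVectorIndex, PromptHelper

# Illustrative values: spell out the model's context window yourself instead of
# relying on LangChain to report it for AzureOpenAI.
prompt_helper = PromptHelper(max_input_size=4096, num_output=256, max_chunk_overlap=20)

# `documents` and `llm_predictor` are the objects from the snippet in the question.
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)
```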
The two error messages here are actually unrelated.
The first error is about the tokenizer (I am guessing you have Python 3.8; in that case, we use a tokenizer from Hugging Face rather than tiktoken, but the warning is harmless).
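For reference, the version-dependent fallback being described looks roughly like this (a sketch of the behaviour, not llama_index's actual code):

```python
import sys

def get_tokenizer():
    # Rough sketch of the fallback described above, not the real implementation.
    if sys.version_info >= (3, 9):
        import tiktoken
        return tiktoken.get_encoding("gpt2").encode
    # On Python 3.8, fall back to the Hugging Face GPT-2 tokenizer. Its
    # model_max_length is 1024, which is what produces the harmless
    # "Token indices sequence length ... (3481 > 1024)" warning.
    from transformers import GPT2TokenizerFast
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    return lambda text: tokenizer(text)["input_ids"]
```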
The second error is most likely about the batch size. You can set `embed_batch_size` like this:
embedding_llm = LangchainEmbedding(OpenAIEmbeddings(), embed_batch_size=1)
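For the setting to take effect, the custom embedding model also has to be passed to the index. With the snippet from the question that would look roughly like this (assuming a llama_index version whose index constructor still accepts `llm_predictor` / `embed_model` directly; newer versions route these through a `ServiceContext`):

```python
# `documents`, `llm_predictor`, and `embedding_llm` come from the snippet in the
# question; passing embed_model ensures the batch-size-1 embeddings are used.
index = GPTSimpleVectorIndex(
    documents,
    llm_predictor=llm_predictor,
    embed_model=embedding_llm,
)
```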
Closing this issue for now! Feel free to re-open if needed