sample-app-aoai-chatGPT
Unexpected Keyword Argument 'dimensions' in embeddings.create() Call
Bug Report: Unexpected Keyword Argument 'dimensions' in embeddings.create() Call
Description
I am encountering an error while running the script scripts/data_preparation.py. It appears that the dimensions argument is being passed to the embeddings.create() function in get_embedding(), but this function does not accept dimensions as an argument.
How to Reproduce
- Run the following command:

    py scripts\data_preparation.py --config scripts\config.json --njobs=4 --form-rec-resource [FORM_REC_RESOURCE] --form-rec-key [FORM_REC_KEY] --embedding-model-endpoint [EMBEDDING_MODEL_ENDPOINT]

- Use the following configuration in scripts/config.json:

    [
      {
        "data_path": "[DATA_PATH]",
        "location": "eastus",
        "subscription_id": "[SUBSCRIPTION_ID]",
        "resource_group": "[RESOURCE_GROUP]",
        "search_service_name": "[SEARCH_SERVICE_NAME]",
        "index_name": "[INDEX_NAME]",
        "chunk_size": 1024,
        "token_overlap": 128,
        "semantic_config_name": "default",
        "language": "en",
        "vector_config_name": "default"
      }
    ]

- Ensure the following environment variables are set:

    FLAG_EMBEDDING_MODEL=AOAI
    FLAG_COHERE=ENGLISH
    FLAG_AAI=V3
    VECTOR_DIMENSION=1536
    AZURE_OPENAI_API_VERSION=2023-05-15
    AZURE_OPENAI_ENDPOINT=[OPEN_AI_ENDPOINT]
    AZURE_OPENAI_API_KEY=[OPENAI_KEY]
    COHERE_MULTILINGUAL_ENDPOINT=
    COHERE_MULTILINGUAL_API_KEY=
    COHERE_ENGLISH_ENDPOINT=
    COHERE_ENGLISH_API_KEY=
Error Message
The error message I encountered is:
Error getting embedding for chunk with error=Error getting embeddings with endpoint=[ENDPOINT] with error=Embeddings.create() got an unexpected keyword argument 'dimensions', retrying, current at 1 retry, 4 retries left
Suspected Cause
In scripts/data_utils.py, the get_embedding() function passes a dimensions argument to embeddings.create(), but the method signature does not expect one. Here's the relevant code snippet:
client = AzureOpenAI(api_version=api_version, azure_endpoint=base_url, api_key=api_key)
if FLAG_AOAI == "V2":
    embeddings = client.embeddings.create(model=deployment_id, input=text)
elif FLAG_AOAI == "V3":
    embeddings = client.embeddings.create(
        model=deployment_id, 
        input=text, 
        dimensions=int(os.getenv("VECTOR_DIMENSION", 1536))
    )
According to the documentation, the embeddings.create() function does not accept a dimensions argument. Here's the expected method signature:
def create(
    *,
    input: str | List[str] | List[int] | List[List[int]],
    model: str = 'text-embedding-ada-002',
    encoding_format: NotGiven | Literal['float', 'base64'] = NOT_GIVEN,
    user: str | NotGiven = NOT_GIVEN,
    extra_headers: Headers | None = None,
    extra_query: Query | None = None,
    extra_body: Body | None = None,
    timeout: float | Timeout | NotGiven | None = NOT_GIVEN
) -> CreateEmbeddingResponse
The method does not take a dimensions argument, which is likely causing the error.
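Possible Workaround
One way to avoid the failure might be to pass dimensions only when the installed client's create() method accepts it. The sketch below is my own assumption, not code from the repository: accepts_keyword and build_embedding_kwargs are hypothetical helper names, and the check relies only on the standard library's inspect.signature. It does not verify whether the deployed model or API version supports dimensions on the service side.

```python
import inspect
import os
from typing import Any, Callable, Dict


def accepts_keyword(func: Callable[..., Any], name: str) -> bool:
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    named = params.get(name)
    if named is not None and named.kind in (
        inspect.Parameter.KEYWORD_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    ):
        return True
    # A **kwargs catch-all accepts any keyword as well.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def build_embedding_kwargs(
    create: Callable[..., Any], model: str, text: str
) -> Dict[str, Any]:
    """Build keyword arguments for `create`, adding `dimensions` only if supported.

    Hypothetical helper: `create` is expected to be something like
    `client.embeddings.create`, but any callable works.
    """
    kwargs: Dict[str, Any] = {"model": model, "input": text}
    if accepts_keyword(create, "dimensions"):
        # Same environment variable and default as the snippet quoted above.
        kwargs["dimensions"] = int(os.getenv("VECTOR_DIMENSION", "1536"))
    return kwargs
```

With the names from the snippet above, the V3 branch could then call client.embeddings.create(**build_embedding_kwargs(client.embeddings.create, deployment_id, text)), which would fall back to the two-argument call on a client whose create() has no dimensions parameter.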