openai-python
Slower than expected performance after upgrading
Confirm this is an issue with the Python library and not an underlying OpenAI API issue
- [X] This is an issue with the Python library
Describe the bug
I'm not absolutely certain this is an issue with the Python library, but after upgrading from v0.28.0 to v1.10.0 we noticed a significant increase in latency (roughly 4x) when requesting embeddings via an Azure OpenAI ada v2 deployment. The Azure portal confirmed this: latency was about 4x higher immediately after we deployed our service with the upgraded package. After downgrading back to v0.28.0, latency returned to normal.
To Reproduce
- Create an AzureOpenAI client
- Request embeddings with the client. For reference, we send about 2-3k embedding requests per 5 minutes
Code snippets
This is how we query Azure with v0.28.0:
import openai
import os

texts = ["this", "is", "a", "test"]

embedding_args = {
    "api_type": AZURE_API_TYPE,
    "api_version": AZURE_API_VERSION,
    "api_key": os.getenv("AZURE_OPENAI_API_KEY"),
    "api_base": os.getenv("AZURE_OPENAI_API_BASE"),
    "deployment_id": os.getenv("AZURE_OPENAI_DEPLOYMENT_ID"),
    "input": texts,
}

res = openai.Embedding.create(**embedding_args)
With v1.10.0:
from openai import AzureOpenAI
import os

texts = ["this", "is", "a", "test"]

openai_client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=AZURE_API_VERSION,
    azure_endpoint=os.getenv("AZURE_OPENAI_API_BASE"),
)

create_args = {
    "model": os.getenv("AZURE_OPENAI_DEPLOYMENT_ID"),
    "input": texts,
}

res = openai_client.embeddings.create(**create_args)
OS
debian:bullseye-slim
Python version
Python v3.11.7
Library version
v1.10.0
cc @RobertCraigie can you take a look?
@achempak-polymer thanks for the report, do you have numpy installed?
@RobertCraigie yup I do, the latest version
@achempak-polymer can you share any more details?
- what model do you have deployed?
- could you reproduce the decreased performance with your example snippets or did this only occur with larger inputs?
I can't reproduce this against the main OpenAI API; both versions take about 0.5-1s with your inputs and the text-embedding-3-large model.
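To make latency comparisons between versions more concrete, requests could be timed over several repetitions rather than a single call. This is a minimal sketch; `time_call` is a hypothetical helper, and the commented-out line assumes the v1.10.0 client from the snippet above:

```python
import statistics
import time


def time_call(fn, *args, repeats=5, **kwargs):
    """Call fn repeatedly, returning (last result, median elapsed seconds)."""
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return result, statistics.median(timings)


# With the v1.x client it might be used like:
# res, median_s = time_call(openai_client.embeddings.create, **create_args)

# Stand-in call so the sketch runs without an API key:
result, median_s = time_call(lambda: sum(range(1000)))
print(result, median_s)
```

Using the median over a handful of repeats smooths out one-off network spikes, which matters when comparing two library versions against the same deployment.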