[Bug]: Pydantic warnings at every embedding call for Azure
What happened?
I am seeing a ton of these in my logs:
UserWarning: Pydantic serializer warnings:
Expected `list[float]` but got `str` - serialized value may not be as expected
I added these lines at the top of the litellm app:
import warnings
warnings.filterwarnings("error")
and got this traceback:
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 1108, in embeddings
    response = await llm_router.aembedding(data)
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 374, in aembedding
    return await litellm.aembedding({**data, "input": input, "caching": self.cache_responses, "client": model_client, **kwargs})
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 1792, in wrapper_async
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 1723, in wrapper_async
    result = await original_function(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 1772, in aembedding
    raise exception_type(
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 5194, in exception_type
    raise original_exception
  File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 1765, in aembedding
    response = await init_response
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/azure.py", line 361, in aembedding
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/azure.py", line 344, in aembedding
    stringified_response = response.model_dump_json()
  File "/usr/local/lib/python3.11/site-packages/pydantic/main.py", line 352, in model_dump_json
    return self.__pydantic_serializer__.to_json(
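For context, the warning itself is generic Pydantic v2 behavior and reproduces without litellm. A minimal sketch (the model here is hypothetical, not litellm code) of a field declared `list[float]` that actually holds a `str`, which is the state the response object is in when the embedding comes back base64-encoded:

from pydantic import BaseModel

class FakeEmbedding(BaseModel):
    embedding: list[float]

# model_construct() skips validation, so the str is stored as-is,
# the same shape the response object has for a base64 payload
e = FakeEmbedding.model_construct(embedding="bm90LWZsb2F0cw==")
print(e.model_dump_json())
# UserWarning: Pydantic serializer warnings:
#   Expected `list[float]` but got `str` - serialized value may not be as expected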
I'm trying to reproduce this error on my side. What request are you sending to the proxy?
I tried this and don't see any warnings:
curl --location 'http://0.0.0.0:8000/embeddings' \
--header 'Content-Type: application/json' \
--data ' {
"model": "azure-embedding-model",
"input": ["write a poem", "hi"]
}'
I am making calls like this to the proxy with openai version < 1.0:
embeddings = await openai.Embedding.acreate(model=model, input=['random sentence here'], user='foo')
looking into this
- I don't see this error with `curl` requests to the proxy
okay, I can repro this now:
I see this warning:
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pydantic/main.py:352: UserWarning: Pydantic serializer warnings:
Expected `list[float]` but got `str` - serialized value may not be as expected
return self.__pydantic_serializer__.to_json(
When making this request to the proxy:
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:8000"
)
embedding_response = client.embeddings.create(
model="azure/azure-embedding-model",
input=["random sentence here"],
user="food"
)
print(embedding_response)
We only see this when making requests with the OpenAI Python package, because the package sends requests with `'encoding_format': 'base64'`, and those responses carry the embedding as a base64 `str` instead of `list[float]`.
So when `model_dump_json()` runs, the serializer expects `List[float]` per the response type: https://github.com/openai/openai-python/blob/main/src/openai/types/embedding.py
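For reference, the base64 payload is just the raw float32 buffer, base64-encoded (openai-python decodes it the same way with numpy). A stdlib-only sketch of a decoder, with a hypothetical helper name, assuming little-endian float32:

import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    # base64 -> raw bytes -> little-endian float32 values
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))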
Able to repro this error with this code snippet using the OpenAI package:
import asyncio
import json
import os

import openai

client = openai.AsyncAzureOpenAI(
    api_key=os.environ["AZURE_API_KEY"],
    azure_endpoint=os.environ["AZURE_API_BASE"],
    api_version=os.environ["AZURE_API_VERSION"],  # the Azure client requires an api_version
)

async def _test():
    # encoding_format="base64" makes the API return the embedding as a base64 str
    response = await client.embeddings.create(
        model="azure-embedding-model",
        input=["write a litellm poem"],
        encoding_format="base64",
    )
    print(response)

    # this is where the Pydantic serializer warning fires:
    # the Embedding type declares list[float], but the field holds a str
    response = response.model_dump_json()
    print(response)

    response = json.loads(response)
    print(response)

asyncio.run(_test())
Fix 1 is to stop using this line in `litellm/llms/azure.py`: stringified_response = response.model_dump_json()
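One possible shape for that fix (a sketch, not the actual patch; the helper name is made up): dump to a plain dict with serializer warnings suppressed, decode any base64 embeddings back to `list[float]`, and JSON-encode the result so the serialized value matches the declared schema:

import base64
import json
import struct

def stringify_embedding_response(response) -> str:
    # model_dump(warnings=False) skips the serializer warning while the
    # list[float]-typed field still holds a base64 str
    data = response.model_dump(warnings=False)
    for item in data.get("data", []):
        emb = item.get("embedding")
        if isinstance(emb, str):  # base64-encoded float32 buffer
            raw = base64.b64decode(emb)
            item["embedding"] = list(struct.unpack(f"<{len(raw) // 4}f", raw))
    return json.dumps(data)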