pydantic-ai
[BUG] Error responses are not handled correctly for google openapi/openrouter
When the API returns a 429 (rate limit exceeded), pydantic-ai throws a datetime-parsing exception instead of surfacing the API's rate-limit error message.
This can easily be replicated by using openrouter with one of the free gemini models.
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "google/gemini-2.0-flash-exp:free",
    base_url="https://openrouter.ai/api/v1",
    api_key="key",
)
agent = Agent(
    model=model,
    system_prompt='Be concise, reply with one sentence.',
)

result = agent.run_sync('Who are you?')
print(result.data)
```
The above returns:

```
Traceback (most recent call last):
  File "/Users/sam/dev/openai/openai_demo.py", line 32, in <module>
    result = agent.run_sync('Who are you?')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 327, in run_sync
    return asyncio.get_event_loop().run_until_complete(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 255, in run
    model_response, request_usage = await agent_model.request(messages, model_settings)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/models/openai.py", line 152, in request
    return self._process_response(response), _map_usage(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/models/openai.py", line 207, in _process_response
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object cannot be interpreted as an integer
```
This happens because the error response is not correctly handled in _process_response:

```
ChatCompletion(id=None, choices=None, created=None, model=None, object=None, service_tier=None, system_fingerprint=None, usage=None, error={'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': '{\n "error": {\n "code": 429,\n "message": "Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-experimental. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.",\n "status": "RESOURCE_EXHAUSTED"\n }\n}\n', 'provider_name': 'Google'}}, user_id='user_...')
```
We should check for the presence of the error object and handle the other fields appropriately.
Note: I have noticed this with both google's OpenAI compat API and openrouter's gemini API.
The ChatCompletion object shown above is what an example error response may look like.
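A minimal sketch of the suggested check (a hypothetical helper, not pydantic-ai's actual code): before reading created or choices, look for an error attribute and surface the provider's payload instead of letting later field access crash.

```python
from types import SimpleNamespace


def raise_on_provider_error(response):
    """Sketch of the suggested guard (hypothetical helper): surface the
    provider's error payload before any other field is touched."""
    error = getattr(response, "error", None)
    if error is not None:
        code = error.get("code", 502)
        message = error.get("message", "unknown provider error")
        raise RuntimeError(f"provider error {code}: {message}")


# Simulate the error-shaped ChatCompletion shown above:
resp = SimpleNamespace(created=None, choices=None,
                       error={"message": "Provider returned error", "code": 429})
try:
    raise_on_provider_error(resp)
except RuntimeError as exc:
    print(exc)  # provider error 429: Provider returned error
```

In pydantic-ai itself this would presumably raise a ModelHTTPError rather than a RuntimeError, as the patches later in this thread do.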
This seems to happen because of how the OpenAI client handles type-casting: it always casts the response to a ChatCompletion type, which allows additional fields to be set on it.
https://github.com/openai/openai-python/blob/89d49335a02ac231925e5a514659c93322f29526/src/openai/_models.py#L87-L100
PRs welcome! Thanks for outlining the issue clearly.
is this issue fixed?
Nope, it is not fixed. I have the same bug with openrouter.
Hey, can you please hurry up? Two months have passed, but the issue remains...
I am facing this issue when trying to use it in the SQL Agent as well. A fix appears to have been merged; not sure if it has been released yet.
I'm having the same problem. Is there any workaround possible? Is this only for Google models? Makes Openrouter pretty unusable with PydanticAI.
Having the same problem using OpenAI SDK.
I can't reproduce it with the following:
```python
import os

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

provider = OpenAIProvider(base_url='https://openrouter.ai/api/v1', api_key=os.getenv('OPENROUTER_API_KEY'))
model = OpenAIModel('google/gemini-2.0-flash-exp:free', provider=provider)
agent = Agent(model=model, system_prompt='Be concise, reply with one sentence.')

result = agent.run_sync('Who are you?')
print(result.data)
```
Can someone in this issue provide me with a snippet that raises the mentioned error?
It most probably happens when you run out of the daily request limit for free models, or make too many requests per minute with a free model.
Facing the same issue when using the openrouter/quasar-alpha model.
Facing this issue with large-context requests, specifically for openrouter/quasar-alpha but not other models. Were other people having this issue specifically with 1M-context models?
same issues
@Kludex
I ran into this error running this code when my base_url was wrong. The LLM host returned 200, but with no response body, despite there being no endpoint there.
My local LLM host was LM Studio and I saw this in its log: "2025-04-14 17:23:10 [ERROR] Unexpected endpoint or method. (POST /chat/completions). Returning 200 anyway"
So it's less a logic issue and more a hard-to-understand error, I think. Of course, if there is no response body you can't parse it into an int.
I had already figured out my base_url was wrong, so it's not an issue for me, but I like to try to be helpful.
```python
from pydantic_ai import Agent
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIModel

# Error caused by: lm_studio = OpenAIProvider(base_url="http://127.0.0.1:1234", api_key="lm studio")
lm_studio = OpenAIProvider(base_url="http://127.0.0.1:1234/v1", api_key="lm studio")
model = OpenAIModel('roleplaiapp/llama-3.3-70b-instruct', provider=lm_studio)
agent = Agent(model=model, system_prompt='Be concise, reply with one sentence.')

result = agent.run_sync('Who are you?')
print(result.data)
```
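The "can't parse it into an int" failure is straightforward to reproduce outside pydantic-ai: passing the missing created value (None) to fromtimestamp raises exactly the TypeError seen in the tracebacks above.

```python
from datetime import datetime, timezone

created = None  # what the client ends up with when the body is empty
try:
    datetime.fromtimestamp(created, tz=timezone.utc)
except TypeError as exc:
    print(f"TypeError: {exc}")
```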
Running into the same issue with:
```python
import os

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

openai_model = OpenAIModel(
    "openai/gpt-4.1-mini",
    provider=OpenAIProvider(
        base_url=os.environ.get("OPENROUTER_BASE_URL"),
        api_key=os.environ.get("OPENROUTER_API_KEY"),
    ),
)
agent = Agent(
    model=openai_model,
    system_prompt="You are a good assistant",
)

result = await agent.run('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```
The cause of my issue was a wrong API key. It started working with a correct API key, but the error message was still not clear.
And in my case, the code works great when the total prompt is small; I ran into this issue only for long-context prompts. But I'm using gpt-4o, so I don't think 20k tokens should be a problem at all.
I think we really do need a separate OpenRouter provider. I am also hitting the same issue in a different way: I'm using QwQ-32B from OpenRouter with tool calling. I get the following exception:
```
Traceback (most recent call last):
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\opentelemetry\trace\__init__.py", line 587, in use_span
    yield span
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\opentelemetry\sdk\trace\__init__.py", line 1105, in start_as_current_span
    yield span
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\instrumented.py", line 209, in _instrument
    yield finish
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\instrumented.py", line 127, in request
    response, usage = await super().request(messages, model_settings, model_request_parameters)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\wrapper.py", line 28, in request
    return await self.wrapped.request(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\openai.py", line 199, in request
    return self._process_response(response), _map_usage(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\openai.py", line 296, in _process_response
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object cannot be interpreted as an integer
```
I don't think the OpenAI provider is going to change. I advise the pydantic_ai maintainers to seriously address this as soon as possible if they really want to keep up with other major agentic frameworks; they don't realize that a lot of their users use OpenRouter, since it makes it easy to swap between and test different models. Here are my suggestions:
- A completely separate OpenRouter provider with flexible settings that can easily be adjusted to the characteristics of different models (or that can even detect them automatically, at least for the top 10 popular models).
- Give other providers the ability to have their base_url set. Some providers, like Anthropic, don't work with OpenRouter in pydantic-ai. Giving us the ability to set the base URLs for these providers may also help reduce these problems.
Hello everyone, the issue still persists as of version 0.1.9. A possible workaround is to navigate to that method and wrap it with:

```python
try:
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
except TypeError:
    timestamp = datetime.now(tz=timezone.utc)
# ... existing code ...
```

In some versions you will see response.created_at instead of response.created, but the workaround is the same (adding the exception path).
P/S: The issue is the same in the OpenAIModel and OpenAIResponseModel classes, so please take note of that.
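Factored out, the workaround above is just a defensive timestamp helper. A runnable sketch (created_timestamp is a hypothetical name, not pydantic-ai's):

```python
from datetime import datetime, timezone


def created_timestamp(created):
    """Fall back to 'now' when the provider omits created (workaround sketch)."""
    try:
        return datetime.fromtimestamp(created, tz=timezone.utc)
    except TypeError:
        return datetime.now(tz=timezone.utc)


print(created_timestamp(1700000000))  # normal path: a real UTC datetime
print(created_timestamp(None))        # error path: current UTC time, no crash
```

Note that, as the next comment points out, papering over the timestamp only defers the crash to the missing choices field; raising a proper error is the better fix.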
I'm still seeing this issue as well.
It seems any error from OpenRouter trips this up, as pydantic-ai expects elements of the response to be there which are not present when an error is returned.
I have added a line to raise the error properly, and it reveals the underlying issue (in my case, trying to use a thinking model with tools); it seems others had other underlying causes.
If you fix the timestamp part, other elements of the response that are assumed to be present then cause problems further down the line. It needs to just raise the error and stop. Here is what I added instead of the timestamp-handling code IchiruTake mentioned, in the same place (ahead of that line, at the start of _process_response):

```python
# requires: from pydantic_ai.exceptions import ModelHTTPError
if getattr(response, "error", None) is not None:
    raise ModelHTTPError(
        status_code=response.error['code'],
        model_name=self.model_name,
        body=response.error['metadata']['raw'],
    )
```
A workaround until #1870 is properly implemented:
OpenRouter sometimes returns malformed responses where response.created or response.choices is None, causing pydantic-ai to crash with TypeError: 'NoneType' object cannot be interpreted as an integer.
This custom model class patches the timestamp issue and properly triggers fallback models when OpenRouter returns invalid responses:
```python
from dataclasses import dataclass
from datetime import UTC, datetime

from pydantic_ai.exceptions import ModelHTTPError
from pydantic_ai.messages import ModelResponse
from pydantic_ai.models.openai import OpenAIModel, chat


@dataclass(init=False)
class OpenRouterModel(OpenAIModel):
    """
    Fixes OpenRouter's missing timestamps and malformed responses.
    Raises proper errors to trigger fallback models when needed.
    """

    def _process_response(self, response: chat.ChatCompletion) -> ModelResponse:
        """Handle OpenRouter's quirks before processing."""
        # Check for a completely missing response
        if response is None:
            raise ModelHTTPError(
                status_code=502,
                model_name=self.model_name,
                body={"error": "Received None response from OpenRouter"},
            )
        # Fix a missing timestamp
        if not hasattr(response, "created") or response.created is None:
            response.created = int(datetime.now(UTC).timestamp())
        # Check for a missing choices array
        if not hasattr(response, "choices") or response.choices is None:
            raise ModelHTTPError(
                status_code=502,
                model_name=self.model_name,
                body={"error": "OpenRouter returned response with no choices"},
            )
        return super()._process_response(response)


# Usage:
model = OpenRouterModel(
    model_name="google/gemini-2.0-flash-exp:free",
    base_url="https://openrouter.ai/api/v1",
    api_key="your-key-here",
)
```
This handles both issues and ensures proper fallback behavior when using FallbackModel.
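For context, the fallback behavior this relies on amounts to "try each model in order, and move on when one raises a model error". A framework-agnostic sketch of that pattern (ProviderError and the model callables are hypothetical stand-ins, not pydantic-ai's FallbackModel internals):

```python
class ProviderError(Exception):
    """Stand-in for an HTTP-style model error (hypothetical)."""


def run_with_fallback(prompt, models):
    """Try each model callable in order; return the first successful answer."""
    errors = []
    for model in models:
        try:
            return model(prompt)
        except ProviderError as exc:
            errors.append(exc)
    raise RuntimeError(f"all {len(models)} models failed: {errors}")


def rate_limited(prompt):  # simulates the 429 case from this thread
    raise ProviderError("429: Provider returned error")


def healthy(prompt):
    return f"answer to {prompt!r}"


print(run_with_fallback("Who are you?", [rate_limited, healthy]))
# answer to 'Who are you?'
```

This is why raising ModelHTTPError from _process_response matters: a swallowed TypeError never reaches the fallback machinery, but a proper model error lets the next model take over.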
@TechNickAI That works! Thanks.
This might be fixed by #2226; then again, it might break some workarounds. Feedback welcome before I merge.