
[BUG] Error responses are not handled correctly for google openapi/openrouter

Open · sambhav opened this issue 11 months ago

When the API returns a 429 (rate limit exceeded), pydantic-ai throws a datetime-parsing exception instead of surfacing the appropriate rate-limit-exceeded error message from the API.

This can easily be replicated by using openrouter with one of the free gemini models.

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "google/gemini-2.0-flash-exp:free",
    base_url="https://openrouter.ai/api/v1",
    api_key="key",
)

agent = Agent(
    model=model,
    system_prompt='Be concise, reply with one sentence.',  
)

result = agent.run_sync('Who are you?')
print(result.data)

The above returns -

Traceback (most recent call last):
  File "/Users/sam/dev/openai/openai_demo.py", line 32, in <module>
    result = agent.run_sync('Who are you?')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 327, in run_sync
    return asyncio.get_event_loop().run_until_complete(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.6/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/agent.py", line 255, in run
    model_response, request_usage = await agent_model.request(messages, model_settings)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/models/openai.py", line 152, in request
    return self._process_response(response), _map_usage(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sam/dev/openai/.venv/lib/python3.12/site-packages/pydantic_ai/models/openai.py", line 207, in _process_response
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object cannot be interpreted as an integer
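The TypeError itself is just datetime.fromtimestamp being handed None for created; a minimal stdlib-only reproduction of that final step:

```python
from datetime import datetime, timezone

# When the provider returns an error payload, response.created is None,
# and fromtimestamp rejects a non-numeric timestamp:
created = None
try:
    datetime.fromtimestamp(created, tz=timezone.utc)
except TypeError as exc:
    print(exc)  # 'NoneType' object cannot be interpreted as an integer
```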

This happens because the error response is not correctly handled in _process_response. This is what an example error response looks like:

ChatCompletion(id=None, choices=None, created=None, model=None, object=None, service_tier=None, system_fingerprint=None, usage=None, error={'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': '{\n  "error": {\n    "code": 429,\n    "message": "Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-experimental. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.",\n    "status": "RESOURCE_EXHAUSTED"\n  }\n}\n', 'provider_name': 'Google'}}, user_id='user_...')

We should check for the presence of the error object and handle the other fields appropriately.

Note: I have noticed this with both Google's OpenAI-compatible API and OpenRouter's Gemini API.
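A minimal sketch of such a guard, run before the timestamp parsing; the response here is a plain SimpleNamespace shaped like the error payload above, and the RuntimeError stands in for whatever exception pydantic-ai would actually raise:

```python
from types import SimpleNamespace

def check_for_provider_error(response):
    # Hypothetical guard: surface the provider's error instead of
    # continuing on to datetime parsing with created=None.
    error = getattr(response, "error", None)
    if error is not None:
        raise RuntimeError(
            f"provider returned error {error.get('code')}: {error.get('message')}"
        )

# Shaped like the OpenRouter 429 response shown above:
resp = SimpleNamespace(id=None, created=None,
                       error={"message": "Provider returned error", "code": 429})
try:
    check_for_provider_error(resp)
except RuntimeError as exc:
    print(exc)  # provider returned error 429: Provider returned error
```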

sambhav avatar Dec 22 '24 21:12 sambhav

This seems to happen due to inappropriate handling of type casting by the OpenAI client. The OpenAI client always casts the response to a ChatCompletion response type, which allows additional fields to be set on it.

https://github.com/openai/openai-python/blob/89d49335a02ac231925e5a514659c93322f29526/src/openai/_models.py#L87-L100

sambhav avatar Dec 22 '24 21:12 sambhav

PRs welcome! Thanks for outlining the issue clearly.

sydney-runkle avatar Dec 23 '24 13:12 sydney-runkle

is this issue fixed?

110kanishkamedankara110 avatar Feb 09 '25 11:02 110kanishkamedankara110

Nope, it is not fixed. I have the same bug with openrouter.

lixelv avatar Feb 18 '25 22:02 lixelv

Hey, can you please hurry up? Two months have passed, but the issue remains...

lixelv avatar Feb 18 '25 23:02 lixelv

I am facing this issue when trying to use this in a SQL agent as well. A commit was merged, but I'm not sure whether it has been released yet.

mamkkl avatar Feb 24 '25 10:02 mamkkl

I'm having the same problem. Is there any workaround possible? Is this only for Google models? Makes Openrouter pretty unusable with PydanticAI.

LouisDeconinck avatar Mar 10 '25 22:03 LouisDeconinck

Having the same problem using OpenAI SDK.

AviSelvakumar avatar Mar 15 '25 19:03 AviSelvakumar

I can't reproduce it with the following:

import os

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

provider = OpenAIProvider(base_url='https://openrouter.ai/api/v1', api_key=os.getenv('OPENROUTER_API_KEY'))
model = OpenAIModel('google/gemini-2.0-flash-exp:free', provider=provider)

agent = Agent(model=model, system_prompt='Be concise, reply with one sentence.')

result = agent.run_sync('Who are you?')
print(result.data)

Can someone in this issue provide me with a snippet that raises the mentioned error?

Kludex avatar Mar 17 '25 13:03 Kludex

> I can't reproduce it with the following: (snippet quoted above)
>
> Can someone in this issue provide me with a snippet that raises the mentioned error?

It most probably happens when you exhaust the daily request limit for free models, or make too many requests per minute for a free model.
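Until the 429 surfaces as a proper error, one coping strategy for free-tier per-minute limits is a generic retry-with-backoff wrapper around the call; a sketch, where flaky is a hypothetical stand-in for an agent call that fails with the TypeError from this thread:

```python
import time

def with_backoff(fn, retries=3, base_delay=1.0):
    # Retry fn, doubling the delay after each failure; TypeError is the
    # symptom this thread describes for rate-limited free-model calls.
    for attempt in range(retries):
        try:
            return fn()
        except TypeError:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

calls = []
def flaky():
    # Hypothetical stand-in: fails once, then succeeds.
    calls.append(1)
    if len(calls) < 2:
        raise TypeError("'NoneType' object cannot be interpreted as an integer")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok
```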

vladlen32230 avatar Mar 24 '25 13:03 vladlen32230

facing same issue when using the openrouter/quasar-alpha model

loveQt avatar Apr 05 '25 18:04 loveQt

Facing this issue with large-context requests, specifically for openrouter/quasar-alpha but not other models. Are other people having this issue specifically with 1M-context models?

siddiki8 avatar Apr 09 '25 01:04 siddiki8

same issues

keyuchen21 avatar Apr 12 '25 20:04 keyuchen21

> I can't reproduce it with the following: (snippet quoted above)
>
> Can someone in this issue provide me with a snippet that raises the mentioned error?

@Kludex

I ran into this error running this code when my base_url was wrong. The LLM host returned 200, but with no response body, despite there being no endpoint there.

My local LLM host was LM Studio, and I saw this in its log: "2025-04-14 17:23:10 [ERROR] Unexpected endpoint or method. (POST /chat/completions). Returning 200 anyway"

So it's less a logic issue and more a hard-to-understand error, I think. Of course, if there is no response body, you can't parse it into an int.

I had already figured out my base_url was wrong, so it's not an issue for me, but I like to try to be helpful.

from pydantic_ai import Agent
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIModel

# Error caused by: lm_studio = OpenAIProvider(base_url="http://127.0.0.1:1234", api_key="lm studio")
lm_studio = OpenAIProvider(base_url="http://127.0.0.1:1234/v1", api_key="lm studio")

model = OpenAIModel('roleplaiapp/llama-3.3-70b-instruct', provider=lm_studio)

agent = Agent(model=model, system_prompt='Be concise, reply with one sentence.')

result = agent.run_sync('Who are you?')
print(result.data)

Grallen avatar Apr 14 '25 21:04 Grallen

Running into the same issue with:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

openai_model = OpenAIModel(
    "openai/gpt-4.1-mini",
    provider=OpenAIProvider(
        base_url=os.environ.get("OPENROUTER_BASE_URL"),
        api_key=os.environ.get("OPENROUTER_API_KEY")
    )
)

agent = Agent(
    model=openai_model,
    system_prompt="You are a good assistant",
)

result = await agent.run('Where does "hello world" come from?')  
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""

The cause of my issue was a wrong API key. It started working with a correct API key, but the error message was still not clear.

kiranscaria avatar Apr 23 '25 03:04 kiranscaria

And in my case, the code works great when the total prompt is small; I ran into this issue only for long-context prompts. But I'm using gpt-4o, so I don't think 20k tokens should be a problem at all.

keyuchen21 avatar Apr 23 '25 13:04 keyuchen21

I think we really do need a separate OpenRouter provider. I am also having the same issue in a different way: I'm using QwQ-32B from OpenRouter with tool calling, and I get the following exception:

Traceback (most recent call last):
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\opentelemetry\trace\__init__.py", line 587, in use_span
    yield span
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\opentelemetry\sdk\trace\__init__.py", line 1105, in start_as_current_span
    yield span
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\instrumented.py", line 209, in _instrument
    yield finish
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\instrumented.py", line 127, in request
    response, usage = await super().request(messages, model_settings, model_request_parameters)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\wrapper.py", line 28, in request
    return await self.wrapped.request(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\openai.py", line 199, in request
    return self._process_response(response), _map_usage(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Amirreza\Desktop\Oga buga\monorepo\.venv\Lib\site-packages\pydantic_ai\models\openai.py", line 296, in _process_response
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object cannot be interpreted as an integer

I don't think the OpenAI provider is going to change. I advise the pydantic_ai maintainers to seriously address this as soon as possible if they really want to keep up with other major agentic frameworks; they don't realize a lot of their users use OpenRouter, since it makes it easy to test and switch between different models. Here are my suggestions:

  1. A completely separate OpenRouter provider with flexible settings that can easily be adjusted to the characteristics of different models (or that can even detect them automatically, at least for the top 10 popular models?).

  2. Make other providers' base_url configurable. Some providers, like Anthropic, don't work with OpenRouter in pydantic-ai; giving us the ability to set the base URLs for these providers may also help reduce these problems.

ItzAmirreza avatar Apr 23 '25 16:04 ItzAmirreza

Hello everyone, the issue still persists as of version 0.1.9. A possible workaround is to navigate to that method and wrap it with:

try:
    timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc)
except TypeError:
    timestamp = datetime.now(tz=timezone.utc)
# ... existing code ...

In some versions you will see response.created_at instead of response.created, but the workaround is the same (adding the exception path). P.S.: The issue is the same in OpenAIModel and OpenAIResponseModel, so please take note of that.

IchiruTake avatar May 04 '25 16:05 IchiruTake

I'm still seeing this issue as well.

Siafu avatar May 06 '25 12:05 Siafu

It seems any error from OpenRouter trips this up, as pydantic-ai expects elements of the response to be present that are not there when an error is returned.

I have added a line to raise the error properly, and it reveals the underlying issue (in my case, trying to use a thinking model with tools); it seems others had other underlying causes.

If you fix the timestamp part, other elements of the response that are assumed to be present then cause problems further down the line. It needs to just raise the error and stop. Here is what I added instead of the timestamp-handling code IchiruTake mentioned, in the same place (ahead of that line, at the start of _process_response):

if hasattr(response, "error") and response.error is not None:
    raise ModelHTTPError(status_code=response.error['code'], model_name=self.model_name, body=response.error['metadata']['raw'])

hillman avatar May 13 '25 08:05 hillman

A workaround until #1870 is properly implemented:

OpenRouter sometimes returns malformed responses where response.created or response.choices is None, causing pydantic-ai to crash with TypeError: 'NoneType' object cannot be interpreted as an integer.

This custom model class patches the timestamp issue and properly triggers fallback models when OpenRouter returns invalid responses:

from dataclasses import dataclass
from datetime import UTC, datetime

from pydantic_ai.exceptions import ModelHTTPError
from pydantic_ai.messages import ModelResponse
from pydantic_ai.models.openai import OpenAIModel, chat


@dataclass(init=False)
class OpenRouterModel(OpenAIModel):
    """
    Fixes OpenRouter's missing timestamps and malformed responses.
    Raises proper errors to trigger fallback models when needed.
    """

    def _process_response(self, response: chat.ChatCompletion) -> ModelResponse:
        """Handle OpenRouter's quirks before processing."""
        # Check for completely missing response
        if response is None:
            raise ModelHTTPError(
                status_code=502,
                model_name=self.model_name,
                body={"error": "Received None response from OpenRouter"},
            )

        # Fix missing timestamp
        if not hasattr(response, "created") or response.created is None:
            response.created = int(datetime.now(UTC).timestamp())

        # Check for missing choices array
        if not hasattr(response, "choices") or response.choices is None:
            raise ModelHTTPError(
                status_code=502,
                model_name=self.model_name,
                body={"error": "OpenRouter returned response with no choices"},
            )

        return super()._process_response(response)


# Usage:
model = OpenRouterModel(
    model_name="google/gemini-2.0-flash-exp:free",
    base_url="https://openrouter.ai/api/v1",
    api_key="your-key-here",
)

This handles both issues and ensures proper fallback behavior when using FallbackModel.

TechNickAI avatar Jul 08 '25 15:07 TechNickAI

@TechNickAI That works! Thanks.

nmhjklnm avatar Jul 09 '25 13:07 nmhjklnm

This might be fixed by #2226; then again, it might break some workarounds. Feedback welcome before I merge.

samuelcolvin avatar Jul 17 '25 03:07 samuelcolvin