[Bug]: Hitting rate limits doesn't appear in UI
### Is there an existing issue for the same bug? (If one exists, thumbs up or comment on the issue instead.)
- [x] I have checked the existing issues.

### Describe the bug and reproduction steps
When rate limits are encountered, there are errors in the backend log, but nothing happens in the UI.

### OpenHands Installation
Development workflow

### OpenHands Version
No response

### Operating System
Linux

### Logs, Errors, Screenshots, and Additional Context
Here is a rate-limit error raised by litellm:
```
    raise RateLimitError(
litellm.exceptions.RateLimitError: litellm.RateLimitError: litellm.RateLimitError: VertexAIException - {
  "error": {
    "code": 429,
    "message": "You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (models/gemini-2.5-pro-preview-03-25) for higher quota limits. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_requests_per_model_per_day",
            "quotaId": "GenerateRequestsPerDayPerProjectPerModel"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      }
    ]
  }
}
```
I thought we added an agent status in the UI that showed rate limits. @raymyers I remember you put that in, did you not?
At least for me, nothing shows up in the UI. Do I have to configure something?
With Claude 3.7 it works for me.
Just to be clear: we are talking about the status bottom left not turning yellow and showing "Rate Limited", right? For you it shows a generic, red error status like in the screenshot, right?
The rate-limit error is set here:
https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/controller/agent_controller.py#L253

```python
elif isinstance(e, RateLimitError):
    await self.set_agent_state_to(AgentState.RATE_LIMITED)
    return
```
That might not always work. The exception is thrown from
https://github.com/All-Hands-AI/OpenHands/blob/cd9d96766c3a46ecbdccdd33a08eb3b4c8b49ecb/openhands/llm/llm.py#L198

We should catch any of the exceptions listed in https://github.com/All-Hands-AI/OpenHands/blob/cd9d96766c3a46ecbdccdd33a08eb3b4c8b49ecb/openhands/llm/llm.py#L41, not just RateLimitError.
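A minimal sketch of broadening the check to the whole retryable tuple rather than matching only RateLimitError. The exception classes and `LLM_RETRY_EXCEPTIONS` below are local stand-ins for the definitions in `llm.py` linked above, and the handler shape is an assumption, not the actual controller code:

```python
# Stand-ins for the litellm exception types listed in llm.py#L41 (assumed contents).
class RateLimitError(Exception):
    pass

class InternalServerError(Exception):
    pass

class Timeout(Exception):
    pass

# Mirrors the LLM_RETRY_EXCEPTIONS tuple referenced above.
LLM_RETRY_EXCEPTIONS = (RateLimitError, InternalServerError, Timeout)

def agent_state_for(e: Exception) -> str:
    # Broadened check: any retryable LLM exception maps to the
    # rate-limited agent state, not just RateLimitError.
    if isinstance(e, LLM_RETRY_EXCEPTIONS):
        return "RATE_LIMITED"
    return "ERROR"
```

In the real controller this would replace the single `isinstance(e, RateLimitError)` branch before calling `set_agent_state_to`.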
I thought this might be similar to https://github.com/All-Hands-AI/OpenHands/pull/7548, where we needed to catch RetryException, but here we have reraise=True from RetryMixin.py.
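The reraise=True point can be illustrated with a hand-rolled sketch of tenacity-style retry semantics (the names here are illustrative, not OpenHands' actual RetryMixin code): with reraise enabled, the original exception propagates after the final attempt, so the controller can still match its type instead of seeing a wrapping RetryError.

```python
class RateLimitError(Exception):
    pass

def retry_call(fn, attempts=3, reraise=True):
    # Minimal retry loop mimicking tenacity's reraise=True behavior.
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as e:
            last_exc = e  # remember the failure and retry
    if reraise:
        raise last_exc  # original type survives, e.g. RateLimitError
    raise RuntimeError("retries exhausted") from last_exc
```

So unlike the RetryException case in #7548, the exception type here should survive the retry wrapper and reach the `isinstance` check in the controller.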
Can you try to use this branch: https://github.com/All-Hands-AI/OpenHands/pull/7970
The error in the original post is `litellm.exceptions.RateLimitError`, isn't it? It's the same as the one for Claude. It would / should show up in the status message just the same. 🤔
Yes, the original post refers to a RateLimitError, and that case should have worked. I assume the reason we have LLM_RETRY_EXCEPTIONS with, for example, InternalServerError is that I have sometimes seen Claude send an "overloaded_error" (or something like that), which OpenHands treated as an error but which effectively just meant I had to try again. I thought that was the issue here.
I'll try to reproduce, maybe there was something small in the UI that I have missed. Not exactly easy to reproduce a RateLimitError 😅
I hear you, @happyherp. I'm pretty sure something weird in the Gemini API (too?) returns an InternalServerError with no understandable reason, and that may be why we recently put InternalServerError in retries. Normally, a 500 shouldn't be a retryable error, but it happens quite a lot during some evaluations... 😢
We didn't intend "overloaded", as far as I know, to be retryable... (though I see why it could be understood as similar to rate limits, it gets confusing)
Maybe we could instead move InternalServerError from retries to evals, maybe here, so it triggers a re-run (if re-runs are enabled), but we don't retry on it in normal circumstances... 🤔 Just a thought! A 500 is an unexpected error that the user should probably be informed about immediately, as an error. WDYT?
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
@MischaPanch I believe this has been resolved.
Here's the latest update for the 3 mentioned error types:
- litellm InternalServerError
- litellm RateLimitError
- litellm Timeout

Retry logic is in place, which will call the LLM at most 3 times. If it still fails, it's visible in the UI:
- Timeout shows a red banner with the error message directly from the API call, plus the agent status.
- RateLimitError shows the agent status.
- InternalServerError shows a red banner with a generic text message, plus the agent status.
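The reported behavior could be summarized as a small lookup table (a sketch only; the banner/status strings are illustrative assumptions, not the actual UI text):

```python
# Hypothetical summary of the UI treatment per litellm error type,
# mirroring the description above.
UI_BEHAVIOR = {
    "Timeout": {"banner": "red: message from API call", "status": "shown"},
    "RateLimitError": {"banner": None, "status": "shown"},
    "InternalServerError": {"banner": "red: generic message", "status": "shown"},
}

def ui_for(error_name: str) -> dict:
    # Unknown errors fall back to a generic error presentation.
    return UI_BEHAVIOR.get(error_name, {"banner": "red: generic", "status": "error"})
```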
Thank you for looking into this @mislavlukach. I will close this issue.
If you think this issue was incorrectly closed, feel free to comment.