
[Bug]: Hitting rate limits doesn't appear in UI

Open MischaPanch opened this issue 8 months ago • 11 comments

Is there an existing issue for the same bug? (If one exists, thumbs up or comment on the issue instead).

  • [x] I have checked the existing issues.

Describe the bug and reproduction steps

When rate limits are encountered, there are errors in the backend log, but nothing happens in the UI.

OpenHands Installation

Development workflow

OpenHands Version

No response

Operating System

Linux

Logs, Errors, Screenshots, and Additional Context

Here is a rate-limit error from litellm:

  raise RateLimitError(
litellm.exceptions.RateLimitError: litellm.RateLimitError: litellm.RateLimitError: VertexAIException - {
  "error": {
    "code": 429,
    "message": "You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (models/gemini-2.5-pro-preview-03-25) for higher quota limits. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_requests_per_model_per_day",
            "quotaId": "GenerateRequestsPerDayPerProjectPerModel"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      }
    ]
  }
}
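As an aside, the quota details in the 429 payload above are machine-readable. A minimal sketch (not OpenHands code, just an illustration of the `google.rpc.QuotaFailure` structure shown above) of pulling the violated quota out of the `details` array to build a user-facing message:

```python
# Sketch: extract QuotaFailure violations from a google.rpc-style 429 payload.
# The payload below is a trimmed copy of the VertexAIException body above.
import json

payload = json.loads("""{
  "error": {
    "code": 429,
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {"@type": "type.googleapis.com/google.rpc.QuotaFailure",
       "violations": [
         {"quotaMetric": "generativelanguage.googleapis.com/generate_requests_per_model_per_day",
          "quotaId": "GenerateRequestsPerDayPerProjectPerModel"}]},
      {"@type": "type.googleapis.com/google.rpc.Help",
       "links": [{"description": "Learn more about Gemini API quotas",
                  "url": "https://ai.google.dev/gemini-api/docs/rate-limits"}]}
    ]
  }
}""")

def quota_violations(error_payload: dict) -> list[dict]:
    """Return the QuotaFailure violations from a google.rpc error payload."""
    out = []
    for detail in error_payload.get("error", {}).get("details", []):
        # Filter out other detail types such as google.rpc.Help
        if detail.get("@type", "").endswith("google.rpc.QuotaFailure"):
            out.extend(detail.get("violations", []))
    return out

for v in quota_violations(payload):
    print(v["quotaId"])  # → GenerateRequestsPerDayPerProjectPerModel
```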

MischaPanch avatar Apr 18 '25 14:04 MischaPanch

I thought we added an agent status in the UI that showed rate limits. @raymyers I remember you put that in, did you not?

mamoodi avatar Apr 18 '25 14:04 mamoodi

At least for me nothing shows up in the UI. Do I have to configure something?

MischaPanch avatar Apr 18 '25 15:04 MischaPanch

With Claude 3.7 it works for me.

Just to be clear: we are talking about the status indicator at the bottom left not turning yellow and showing "Rate Limited", right? For you it shows a generic red error status like in the screenshot, right?

[screenshot]

happyherp avatar Apr 21 '25 08:04 happyherp

The rate-limit error is set here

https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/controller/agent_controller.py#L253

            elif isinstance(e, RateLimitError):
                await self.set_agent_state_to(AgentState.RATE_LIMITED)
                return

That might not always work. The exception is thrown from

https://github.com/All-Hands-AI/OpenHands/blob/cd9d96766c3a46ecbdccdd33a08eb3b4c8b49ecb/openhands/llm/llm.py#L198

We should catch any of the exceptions in https://github.com/All-Hands-AI/OpenHands/blob/cd9d96766c3a46ecbdccdd33a08eb3b4c8b49ecb/openhands/llm/llm.py#L41, not just RateLimitError.
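The broader handling could look something like this. This is a hypothetical sketch, not the actual controller code; the exception classes are stand-ins for litellm's, and `LLM_RETRY_EXCEPTIONS` mirrors the tuple linked above:

```python
# Hypothetical sketch: map any retry-exhausted LLM exception (not only
# RateLimitError) to a UI-visible agent state. Classes are stand-ins for
# litellm's exceptions; AgentState values mirror the ones referenced above.
from enum import Enum

class RateLimitError(Exception): ...
class InternalServerError(Exception): ...
class APIConnectionError(Exception): ...

# Stand-in for openhands.llm.llm.LLM_RETRY_EXCEPTIONS
LLM_RETRY_EXCEPTIONS = (RateLimitError, InternalServerError, APIConnectionError)

class AgentState(Enum):
    RATE_LIMITED = "rate_limited"
    ERROR = "error"

def state_for(exc: Exception) -> AgentState:
    """Choose the UI-visible agent state for a failed LLM call."""
    if isinstance(exc, RateLimitError):
        return AgentState.RATE_LIMITED
    if isinstance(exc, LLM_RETRY_EXCEPTIONS):
        # Any other retryable LLM error still surfaces as an error state
        return AgentState.ERROR
    raise exc  # non-LLM errors propagate unchanged

print(state_for(RateLimitError()).value)       # → rate_limited
print(state_for(InternalServerError()).value)  # → error
```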

I thought this might be similar to https://github.com/All-Hands-AI/OpenHands/pull/7548, where we needed to catch RetryException, but here we have reraise=True from RetryMixin.py.
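For readers following along, the point about reraise=True is the retry-exhaustion semantics. A pure-Python sketch (an assumption about what RetryMixin's reraise=True achieves, without depending on the tenacity library itself):

```python
# Sketch: with reraise=True, the ORIGINAL exception type propagates after
# retries are exhausted, so the controller's isinstance(e, RateLimitError)
# check still matches. Without it, the caller sees a wrapper and the check
# would fail (the situation PR #7548 fixed for RetryException).
class RateLimitError(Exception):
    """Stand-in for litellm.exceptions.RateLimitError."""

class RetryError(Exception):
    """Wrapper the caller would see if reraise were False."""

def call_with_retries(fn, attempts=3, reraise=True):
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except RateLimitError as e:
            last_exc = e  # a real implementation would back off here
    if reraise:
        raise last_exc                     # original type propagates
    raise RetryError() from last_exc       # wrapper hides the original

def always_rate_limited():
    raise RateLimitError("429: RESOURCE_EXHAUSTED")

try:
    call_with_retries(always_rate_limited)
except RateLimitError:
    print("caught RateLimitError")  # the controller's isinstance check works
```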

happyherp avatar Apr 21 '25 08:04 happyherp

Can you try to use this branch: https://github.com/All-Hands-AI/OpenHands/pull/7970

happyherp avatar Apr 21 '25 09:04 happyherp

The error in the original post is litellm.exceptions.RateLimitError, isn't it? It's the same as the one for Claude, so it would / should show up in the status message just the same. 🤔

enyst avatar Apr 21 '25 13:04 enyst

Yes, the original post refers to a RateLimitError, and that should have worked. I assume the reason we have LLM_RETRY_EXCEPTIONS with, for example, InternalServerError is that I have sometimes seen Claude send an "overloaded_error" (or something like that), which OpenHands treated as an error but which effectively just meant I had to try again. I thought that was the issue here.

happyherp avatar Apr 21 '25 20:04 happyherp

I'll try to reproduce, maybe there was something small in the UI that I have missed. Not exactly easy to reproduce a RateLimitError 😅

MischaPanch avatar Apr 21 '25 20:04 MischaPanch

I hear you, @happyherp. I'm pretty sure something weird in the Gemini API (too?) returns an InternalServerError without an understandable reason, and that may be why we recently put InternalServerError in retries. Normally, 500 shouldn't be a retryable error. But it happens quite a lot during some evaluations... 😢

We didn't intend "overloaded", as far as I know, to be retryable... (though I see why it could be understood as similar to rate limits, it gets confusing)

Maybe we could instead... move InternalServerError from retries to evals, maybe here, so it triggers a re-run (if re-runs are enabled), but otherwise we don't retry on it in normal circumstances... 🤔 Just a thought! 500 is an unexpected error that the user should probably be informed about immediately, as an error. WDYT?

enyst avatar Apr 21 '25 20:04 enyst

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 05 '25 02:06 github-actions[bot]

@MischaPanch I believe this has been resolved.

Here's the latest update for the 3 mentioned error types:

  1. litellm InternalServerError
  2. litellm RateLimitError
  3. litellm Timeout

Retry logic is in place, which will call the LLM at most 3 times. If it still fails, it's visible in the UI:

  • Timeout shows a red banner with the error message directly from the API call, plus the agent status.
  • RateLimitError shows the agent status.
  • InternalServerError shows a red banner with a generic text message, plus the agent status.

[screenshots]

mislavlukach avatar Jun 17 '25 19:06 mislavlukach

Thank you for looking into this @mislavlukach. I will close this issue.

If you think this issue was incorrectly closed, feel free to comment.

amanape avatar Jun 18 '25 11:06 amanape