OpenHands icon indicating copy to clipboard operation
OpenHands copied to clipboard

Use litellm Router for rate limiting and/or fallback LLMs

Open enyst opened this issue 1 year ago • 1 comments

Summary

Litellm has the Router class that encapsulates completion with rate limits handling. We can look into using it, because it should allow us to define a RetryPolicy hopefully based on how long the provider has left (though in my reading, it doesn't yet). It does allow to define a fall back LLM in case one provider runs out of tries. (https://github.com/All-Hands-AI/OpenHands/issues/1263)

Rate limit headers for OpenAI: https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers

Rate limit headers for Anthropic: https://docs.anthropic.com/en/api/rate-limits#response-headers

Technical Design

Replace completion direct call to litellm with Router.completion

Alternatives to Consider

Continue to do it ourselves. Various providers have different rate limits, so our options are:

  • don't get the remaining time, and think again of some sensible defaults, user-configurable; better documentation
  • get the remaining time from liteLLM

Fall back LLM:

  • do it ourselves
  • configure litellm

enyst avatar Sep 25 '24 22:09 enyst

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Nov 02 '24 01:11 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar Nov 12 '24 01:11 github-actions[bot]

Bouncing this back to here https://github.com/All-Hands-AI/OpenHands/issues/4184

BradKML avatar Sep 04 '25 01:09 BradKML