Use litellm Router for rate limiting and/or fallback LLMs
Summary
litellm provides a Router class that wraps completion with rate-limit handling. We could look into using it, since it should let us define a RetryPolicy, ideally based on how long the provider says to wait (though as far as I can tell, it doesn't support that yet). It does allow defining a fallback LLM for when one provider runs out of retries. (https://github.com/All-Hands-AI/OpenHands/issues/1263)
Rate limit headers for OpenAI: https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers
Rate limit headers for Anthropic: https://docs.anthropic.com/en/api/rate-limits#response-headers
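To illustrate why "how long the provider has left" differs per provider: OpenAI's reset headers are relative durations (e.g. "6m0s"), while Anthropic reports an absolute RFC 3339 timestamp, and both send Retry-After on 429s. A minimal sketch of header parsing; the helper names and the exact unit list are assumptions for illustration, not an existing API:

```python
import re

def parse_openai_reset(value):
    """Parse OpenAI's x-ratelimit-reset-* duration strings (e.g. "1s",
    "6m0s", "20ms") into seconds. The unit table is an assumption
    covering the examples in the OpenAI rate-limit docs."""
    units = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
    total = 0.0
    # "ms" must precede "m" and "s" in the alternation to match first.
    for amount, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", value):
        total += float(amount) * units[unit]
    return total

def seconds_until_reset(headers):
    """Extract a wait time in seconds from provider response headers.
    Prefers Retry-After (sent by both providers on 429), then falls
    back to OpenAI's relative reset duration."""
    if "retry-after" in headers:
        return float(headers["retry-after"])
    if "x-ratelimit-reset-requests" in headers:  # OpenAI-style
        return parse_openai_reset(headers["x-ratelimit-reset-requests"])
    return None  # Anthropic's reset is a timestamp; needs clock math instead
```

This is roughly the per-provider logic we would have to maintain ourselves if litellm doesn't surface the remaining time.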
Technical Design
Replace the direct call to litellm completion with Router.completion.
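A sketch of what that swap could look like, assuming litellm's Router configuration style (model_list entries, num_retries, fallbacks); model names and keys below are placeholders, and the Router instantiation is guarded so the config can be inspected without litellm installed:

```python
# Placeholder deployments; "primary" and "backup" are aliases we choose.
model_list = [
    {
        "model_name": "primary",
        "litellm_params": {"model": "gpt-4o", "api_key": "PLACEHOLDER"},
    },
    {
        "model_name": "backup",
        "litellm_params": {"model": "claude-3-5-sonnet-20240620", "api_key": "PLACEHOLDER"},
    },
]

# After "primary" exhausts its retries, fall back to "backup".
fallbacks = [{"primary": ["backup"]}]

try:
    from litellm import Router

    router = Router(model_list=model_list, num_retries=3, fallbacks=fallbacks)
    # Call sites change from litellm.completion(...) to:
    # response = router.completion(
    #     model="primary",
    #     messages=[{"role": "user", "content": "hi"}],
    # )
except ImportError:
    pass  # litellm not installed in this environment
```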
Alternatives to Consider
Continue to do it ourselves. Different providers have different rate limits, so our options are:
- don't read the remaining time; instead pick sensible, user-configurable defaults and document them better
- get the remaining time from litellm
Fallback LLM:
- implement it ourselves
- configure it through litellm
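For the "implement it ourselves" option, the core loop is small: try each model in order, retry a few times on rate-limit failures, then fall through to the next. A minimal sketch; the function and its parameters are hypothetical, and the injected `call` stands in for a real litellm completion call:

```python
import time

def complete_with_fallback(models, call, max_retries=2, wait=1.0):
    """Try each model in order; on a rate-limit-style failure, retry up
    to max_retries times (sleeping `wait` seconds between attempts),
    then fall through to the next model. Raises the last error if all
    models fail."""
    last_exc = None
    for model in models:
        for attempt in range(max_retries + 1):
            try:
                return call(model)
            except Exception as exc:  # in practice: litellm's RateLimitError
                last_exc = exc
                if attempt < max_retries:
                    time.sleep(wait)
    raise last_exc
```

The `wait` here is the fixed default; the header parsing discussed above is what would let us replace it with the provider's actual remaining time.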
Bouncing this back to here: https://github.com/All-Hands-AI/OpenHands/issues/4184