Feature / Suggestion
In phi/llm/anthropic/claude.py, I think the `invoke` and `invoke_stream` methods would benefit from a try-except around the return statement to catch rate limit errors with a backoff mechanism. Something like this might work [WARNING: UNTESTED CODE]:
```python
import time  # new dependency, for sleeping between retries

from anthropic import RateLimitError


class ExponentialBackoff:
    def __init__(self, base: float = 2, max_retries: int = 5, max_backoff: float = 60):
        self.base = base
        self.max_retries = max_retries
        self.max_backoff = max_backoff
        self.retry_count = 0

    def backoff(self) -> float:
        # Delay grows as base**0, base**1, base**2, ... capped at max_backoff
        delay = self.base ** self.retry_count
        self.retry_count += 1
        return min(delay, self.max_backoff)


# Modified invoke method of the Claude class (Message, AnthropicMessage,
# List, Dict, Any are already imported in claude.py):
def invoke(self, messages: List[Message]) -> AnthropicMessage:
    api_kwargs: Dict[str, Any] = self.api_kwargs
    api_messages: List[dict] = []

    for m in messages:
        if m.role == "system":
            api_kwargs["system"] = m.content
        else:
            api_messages.append({"role": m.role, "content": m.content or ""})

    backoff = ExponentialBackoff()
    while True:
        try:
            return self.client.messages.create(
                model=self.model,
                messages=api_messages,
                **api_kwargs,
            )
        except RateLimitError as e:
            if backoff.retry_count > backoff.max_retries:
                raise e  # maximum retries exceeded, re-raise the exception
            delay = backoff.backoff()
            print(f"Rate limit exceeded. Retrying in {delay} seconds...")
            time.sleep(delay)
```
Really good idea, will test and probably release this week.
Another possible approach is to apply a decorator to the `invoke` and `invoke_stream` functions that enforces a rate limit. This approach is more proactive, but it might require changes elsewhere to make the enforced rate limit configurable.
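A minimal sketch of that decorator idea, using only the stdlib — the names `rate_limited` and `requests_per_minute` are hypothetical helpers for illustration, not part of phidata:

```python
import functools
import time


def rate_limited(requests_per_minute: float):
    """Hypothetical decorator that spaces out calls so the wrapped
    function is never invoked faster than the given rate."""
    min_interval = 60.0 / requests_per_minute

    def decorator(func):
        last_call = [0.0]  # mutable closure state: time of the previous call

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = min_interval - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)  # proactively delay instead of hitting the API limit
            last_call[0] = time.monotonic()
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rate_limited(requests_per_minute=600)  # at most 10 calls per second
def invoke_api():
    return "ok"
```

Wiring this into `invoke`/`invoke_stream` would mean reading the rate from the model config, which is the "changes elsewhere" mentioned above.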
Great work on this repo. Everybody says it but it is worth repeating.
@jonny7737 decorators are a great idea (maybe using tenacity). Thank you for your help in making this better. I'm working on this :)
As for a timeline, I'm tinkering with a new concept, and after putting that out I'll work on the retry logic, as that seems to be a p0 for a number of use-cases.
Hope I have contributed in some small way. Thanks for listening.
Keep up the great work!