strix icon indicating copy to clipboard operation
strix copied to clipboard

Fix rate limiting for Anthropic Claude API with robust multi-agent support

Open SFRDevelopment opened this issue 1 month ago • 0 comments

Problem

Strix encounters RateLimitError when using Anthropic Claude models, especially when multiple agents run in parallel. The default rate limiting configuration (1.0s delay, 6 concurrent requests) is too aggressive for Anthropic's tier 1 limits (50 requests/minute), causing failures when multiple agents spawn simultaneously.

Solution

Implemented robust rate limiting specifically for Anthropic/Claude models:

  • Increased delay: 1.0s → 3.0s (~20 req/min, 60% under 50 req/min limit)
  • Reduced concurrency: 6 → 1 (prevents burst requests from parallel agents)
  • Added 10% safety buffer: Each request waits 10% longer than calculated (3.3s effective = ~18 req/min)
  • Enhanced error handling: 60-second backoff on rate limit errors before retry
  • Thread-safe implementation: Lock-based synchronization ensures global queue works correctly across all agents

Changes

  • Modified strix/llm/request_queue.py:
    • Auto-detect Anthropic models via STRIX_LLM environment variable
    • Apply conservative rate limits (3.0s delay, 1 concurrent) for Anthropic
    • Add 10% safety buffer to delay calculation for Anthropic models
    • Implement 60s backoff on rate limit errors for Anthropic
    • Support manual override via STRIX_RATE_LIMIT_DELAY and STRIX_RATE_LIMIT_CONCURRENT environment variables
    • Log rate limiting configuration on initialization

Testing

  • Tested with anthropic/claude-sonnet-4-20250514 and anthropic/claude-3-5-haiku-20241022
  • Verified with multiple parallel agents (5+ agents running simultaneously)
  • No rate limit errors observed during extended scans
  • Successfully handles burst scenarios when multiple agents spawn at once

Rate Limit Math

  • Anthropic Limit: 50 requests/minute = ~1.2s per request
  • Configured: 3.0s delay + 10% buffer = 3.3s effective = ~18 req/min
  • Safety margin: 64% under limit (very conservative for multi-agent scenarios)

Backward Compatibility

  • Fully backward compatible
  • Only affects Anthropic/Claude models automatically
  • Other providers (OpenAI, etc.) use default settings unless manually configured
  • Manual configuration via environment variables works for all providers

Related Issues

This addresses rate limiting issues reported by users when using Anthropic Claude models, especially in multi-agent scenarios.

SFRDevelopment avatar Nov 15 '25 23:11 SFRDevelopment