strix
strix copied to clipboard
Fix rate limiting for Anthropic Claude API with robust multi-agent support
Problem
Strix encounters RateLimitError when using Anthropic Claude models, especially when multiple agents run in parallel. The default rate limiting configuration (1.0s delay, 6 concurrent requests) is too aggressive for Anthropic's tier 1 limits (50 requests/minute), causing failures when multiple agents spawn simultaneously.
Solution
Implemented robust rate limiting specifically for Anthropic/Claude models:
- Increased delay: 1.0s → 3.0s (~20 req/min, 60% under 50 req/min limit)
- Reduced concurrency: 6 → 1 (prevents burst requests from parallel agents)
- Added 10% safety buffer: Each request waits 10% longer than calculated (3.3s effective = ~18 req/min)
- Enhanced error handling: 60-second backoff on rate limit errors before retry
- Thread-safe implementation: Lock-based synchronization ensures global queue works correctly across all agents
Changes
- Modified
strix/llm/request_queue.py:- Auto-detect Anthropic models via
STRIX_LLMenvironment variable - Apply conservative rate limits (3.0s delay, 1 concurrent) for Anthropic
- Add 10% safety buffer to delay calculation for Anthropic models
- Implement 60s backoff on rate limit errors for Anthropic
- Support manual override via
STRIX_RATE_LIMIT_DELAYandSTRIX_RATE_LIMIT_CONCURRENTenvironment variables - Log rate limiting configuration on initialization
- Auto-detect Anthropic models via
Testing
- Tested with
anthropic/claude-sonnet-4-20250514andanthropic/claude-3-5-haiku-20241022 - Verified with multiple parallel agents (5+ agents running simultaneously)
- No rate limit errors observed during extended scans
- Successfully handles burst scenarios when multiple agents spawn at once
Rate Limit Math
- Anthropic Limit: 50 requests/minute = ~1.2s per request
- Configured: 3.0s delay + 10% buffer = 3.3s effective = ~18 req/min
- Safety margin: 64% under limit (very conservative for multi-agent scenarios)
Backward Compatibility
- Fully backward compatible
- Only affects Anthropic/Claude models automatically
- Other providers (OpenAI, etc.) use default settings unless manually configured
- Manual configuration via environment variables works for all providers
Related Issues
This addresses rate limiting issues reported by users when using Anthropic Claude models, especially in multi-agent scenarios.