Fix rate limiting for Anthropic Claude API with robust multi-agent support

Open SFRDevelopment opened this issue 1 month ago • 0 comments

Problem

Strix hits RateLimitError when using Anthropic Claude models, especially when multiple agents run in parallel. The default rate limiting configuration (1.0s delay, 6 concurrent requests) is too aggressive for Anthropic's Tier 1 limit of 50 requests/minute, so requests fail when several agents spawn at the same time.

Solution

Implemented robust rate limiting specifically for Anthropic/Claude models:

- Increased delay: 1.0s → 3.0s (~20 req/min, 60% under the 50 req/min limit)
- Reduced concurrency: 6 → 1 (prevents burst requests from parallel agents)
- Added 10% safety buffer: each request waits 10% longer than the configured delay (3.3s effective = ~18 req/min)
- Enhanced error handling: 60-second backoff on rate limit errors before retry
- Thread-safe implementation: lock-based synchronization ensures the global queue works correctly across all agents
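
The points above can be illustrated with a minimal sketch. This is not the code in strix/llm/request_queue.py; the `Throttle` class, its parameters, and the local `RateLimitError` stand-in are hypothetical names chosen for illustration. It assumes a single process in which all agents share one throttle instance.

```python
import threading
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the provider's rate limit exception."""


class Throttle:
    """Spaces request starts by delay * (1 + buffer) seconds across all threads."""

    def __init__(self, delay=3.0, buffer=0.10, max_concurrent=1, backoff=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = delay * (1.0 + buffer)  # 3.0s + 10% = 3.3s
        self.backoff = backoff
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._next_slot = 0.0

    def _wait_for_slot(self):
        # Reserve the next start time under the lock, then sleep outside it
        # so that waiting threads do not hold the lock while idle.
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self.interval
        wait = slot - now
        if wait > 0:
            self._sleep(wait)

    def call(self, fn, max_retries=3):
        """Run fn() under the throttle, backing off on RateLimitError."""
        for attempt in range(max_retries + 1):
            try:
                with self._slots:  # caps the number of in-flight requests
                    self._wait_for_slot()
                    return fn()
            except RateLimitError:
                if attempt == max_retries:
                    raise
                self._sleep(self.backoff)  # 60s pause before retrying
```

In this sketch every agent thread would route its requests through a shared instance, for example `throttle.call(lambda: send_request(prompt))`, where `send_request` is a placeholder for whatever function actually calls the model.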

Changes

Modified strix/llm/request_queue.py:
- Auto-detect Anthropic models via STRIX_LLM environment variable
- Apply conservative rate limits (3.0s delay, 1 concurrent) for Anthropic
- Add 10% safety buffer to delay calculation for Anthropic models
- Implement 60s backoff on rate limit errors for Anthropic
- Support manual override via STRIX_RATE_LIMIT_DELAY and STRIX_RATE_LIMIT_CONCURRENT environment variables
- Log rate limiting configuration on initialization
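
As a rough sketch of how the detection and override logic listed above could fit together: the environment variable names come from this issue, while `RateLimitConfig`, `resolve_rate_limit`, and the substring-based detection rule are assumptions made for illustration, not the actual implementation.

```python
import logging
import os
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class RateLimitConfig:
    delay: float          # seconds between requests
    max_concurrent: int   # requests allowed in flight at once


DEFAULT_LIMITS = RateLimitConfig(delay=1.0, max_concurrent=6)
ANTHROPIC_LIMITS = RateLimitConfig(delay=3.0, max_concurrent=1)


def resolve_rate_limit(env=None):
    """Pick rate limits from STRIX_LLM, then apply manual overrides if set."""
    env = os.environ if env is None else env
    model = env.get("STRIX_LLM", "").lower()
    # Assumed detection rule: a substring match on the model identifier.
    is_anthropic = "anthropic" in model or "claude" in model
    base = ANTHROPIC_LIMITS if is_anthropic else DEFAULT_LIMITS
    config = RateLimitConfig(
        delay=float(env.get("STRIX_RATE_LIMIT_DELAY", base.delay)),
        max_concurrent=int(env.get("STRIX_RATE_LIMIT_CONCURRENT", base.max_concurrent)),
    )
    logger.info("Rate limiting: delay=%.1fs, max_concurrent=%d",
                config.delay, config.max_concurrent)
    return config
```

Passing a plain dict instead of `os.environ` keeps the function easy to test, for example `resolve_rate_limit({"STRIX_LLM": "anthropic/claude-sonnet-4-20250514"})`.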

Testing

- Tested with anthropic/claude-sonnet-4-20250514 and anthropic/claude-3-5-haiku-20241022
- Verified with multiple parallel agents (5+ agents running simultaneously)
- No rate limit errors observed during extended scans
- Successfully handles burst scenarios when multiple agents spawn at once

Rate Limit Math

- Anthropic limit: 50 requests/minute = one request every 1.2s
- Configured: 3.0s delay + 10% buffer = 3.3s effective = ~18 req/min
- Safety margin: ~64% under the limit (very conservative for multi-agent scenarios)
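
The arithmetic can be checked with a few lines of Python. The helper name `rate_summary` is hypothetical; the inputs are the values quoted in this issue, and the calculation assumes requests are issued strictly one at a time.

```python
def rate_summary(limit_per_min=50, delay=3.0, buffer=0.10):
    """Return the effective delay, request rate, and margin under the limit."""
    min_interval = 60 / limit_per_min     # fastest allowed spacing: 1.2s
    effective = delay * (1 + buffer)      # 3.0s + 10% = 3.3s
    rate = 60 / effective                 # about 18.2 requests per minute
    margin = 1 - rate / limit_per_min     # about 0.636, i.e. ~64% under the limit
    return {
        "min_interval_s": min_interval,
        "effective_delay_s": effective,
        "requests_per_min": rate,
        "margin": margin,
    }


summary = rate_summary()
print(f"{summary['requests_per_min']:.1f} req/min, "
      f"{summary['margin']:.0%} under the limit")
# prints: 18.2 req/min, 64% under the limit
```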

Backward Compatibility

- Fully backward compatible
- The new limits apply automatically only to Anthropic/Claude models
- Other providers (OpenAI, etc.) keep the default settings unless manually configured
- Manual configuration via environment variables works for all providers

Related Issues

This addresses rate limiting issues reported by users when using Anthropic Claude models, especially in multi-agent scenarios.

Nov 15 '25 23:11 SFRDevelopment