Anthropic rate limit too many connections issue
What happened?
When I use Anthropic models I get:
429 {"type":"error","error":{"type":"rate_limit_error","message":"Number of concurrent connections has exceeded your rate limit. Please try again later or contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase."}}
This happens after a couple of file reads, almost every time.
Steps to reproduce
- Use the Anthropic provider with Sonnet 4 or similar
- Give it a task that involves reading a few files
- Wait for it to go back and forth a few times reading the files
- boom
Relevant API REQUEST output
Provider/Model
anthropic:claude-sonnet-4
Operating System
macOS Ventura
System Info
MacBook Pro
Cline Version
3.17.15
Additional context
I know this has been reported before. I know there is a retry, but the retry also fails, and usually the second retry fails too, so I have to wait several minutes.
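To illustrate why a quick fixed retry is unlikely to help here, below is a rough sketch of a retry delay that honors a `retry-after` header when one is present and otherwise backs off exponentially. The function name `retryDelayMs`, the default base and cap values, and the idea that this particular error carries a usable `retry-after` value are all assumptions for illustration. This is not Cline's actual retry logic.

```typescript
// Hypothetical helper, not part of Cline or the Anthropic SDK.
// attempt: 0 for the first retry, 1 for the second, and so on.
// retryAfterHeader: raw `retry-after` header value in seconds, if the response had one.
function retryDelayMs(
  attempt: number,
  retryAfterHeader?: string,
  baseMs: number = 2000, // assumed starting delay
  maxMs: number = 120000, // assumed cap for the exponential fallback
): number {
  if (retryAfterHeader !== undefined && retryAfterHeader.trim() !== "") {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) {
      // Trust the server's hint as-is rather than retrying earlier.
      return seconds * 1000;
    }
  }
  // No usable hint: double the delay on each attempt, up to the cap.
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Example values:
// retryDelayMs(0)        -> 2000
// retryDelayMs(3)        -> 16000
// retryDelayMs(0, "30")  -> 30000
```

With delays like these, the second and third retries land tens of seconds apart instead of immediately hitting the same limit again.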
No, this is not the same as the rate_limit_error caused by the 40k or 80k token limits. This one is explicitly about connections.
I think Cline should somehow:
- Reuse the existing connection if HTTP/1.1 is used (I don't know at what point Anthropic checks the connection rate limit)
- Close previous connections correctly
- Wait some time for the previous connection to be "closed", meaning until Anthropic has noticed it
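A minimal sketch of the last two points, assuming the requests can be funneled through a single queue: each request starts only after the previous one has fully settled, plus a short cooldown. The class name `SerialGate` and the `cooldownMs` parameter are hypothetical, and this does not control the underlying sockets. It only guarantees that the client never has two requests in flight at once.

```typescript
// Hypothetical helper: runs tasks strictly one at a time and leaves a gap
// after each one, so the server has time to register the closed connection.
class SerialGate {
  private tail: Promise<void> = Promise.resolve();
  private readonly cooldownMs: number;
  private readonly sleep: (ms: number) => Promise<void>;

  constructor(
    cooldownMs: number,
    // Injectable for testing; defaults to a real timer.
    sleep: (ms: number) => Promise<void> = (ms) =>
      new Promise<void>((resolve) => setTimeout(resolve, ms)),
  ) {
    this.cooldownMs = cooldownMs;
    this.sleep = sleep;
  }

  run<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after everything queued before it has finished.
    const result = this.tail.then(task);
    // The next caller waits for this task to settle (success or failure),
    // then for the cooldown. The tail itself never rejects.
    this.tail = result.then(
      () => this.sleep(this.cooldownMs),
      () => this.sleep(this.cooldownMs),
    );
    return result;
  }
}
```

Usage would look like `gate.run(() => sendRequest(...))`, where `sendRequest` stands in for whatever function performs the API call. A failed request does not block the queue, because the cooldown is scheduled on both success and failure.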
Using another provider that also happens to support Anthropic is not a fix.
Thanks for reporting this, @SimoneGianni. Are you using multiple Cline windows via multiple VSCode windows? Or are you encountering this 429 error for concurrent connections while using only one Cline window, within one task?
Only one window, only one task. It starts reading files, and at file 4 or 5 I inevitably get the error.
It happens when it READS 4 or 5 files, because it's very fast at doing so.
Try writing 10 md files with something in them, then tell Cline to read the files in the folder and do something with them. It will start reading the files one after the other and trigger the problem.
When it is modifying files, the time the model needs to produce the tokens, Cline needs to apply them, and I need to review them means it is not fast enough to trigger the error.
I suggest using our Cline provider, or any other provider that doesn't have such strict rate limits.