HTTP 429 Too Many Requests
Before submitting your bug report
- [X] I believe this is a bug. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that reports the same bug
- [X] I've seen the troubleshooting guide on the Continue Docs
Relevant environment info
- OS: Windows 11
- Continue: 0.8.52
- IDE: VS Code 1.92.1
- Model: llama-3.1-8b-instant
- config.json:
"tabAutocompleteModel": {
"title": "llama-3.1-8b-instant",
"model": "llama3.1-8b",
"contextLength": 131072,
"apiKey": "********",
"completionOptions": {},
"provider": "groq"
Description
This occurred right after I opened a .txt file to edit a couple of things. I wasn't using Continue or attempting any tab completions; as a matter of fact, I didn't even have the Continue side tab open.
Looking at the 'Continue: LLM...' output, it appears that Continue sent* (see below) the contents of the .txt file I was reading over and over until Groq's request limit prevented it from sending any more. I received 5 of the HTTP 429 error messages (see screenshot below) all at once, so my guess is that it was sending the requests back to back without stopping.
- Continue sent to Groq the following on repeat:
- "========================================================================== ========================================================================== Settings: contextLength: 131072 model: llama3.1-8b maxTokens: 2048 temperature: 0.01 stop: <fim_prefix>,<fim_suffix>,<fim_middle>,<file_sep>,<|endoftext|>,</fim_middle>,,
,
,/src/,#- coding: utf-8,```, function, class, module, export, import raw: true log: undefined
############################################ <fim_prefix> "The content of my txt..." <fim_suffix><fim_middle>
...then it would repeat from the beginning..."
To reproduce
No response
Log output
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.969s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.969s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.287s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.287s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 399ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 399ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.042s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.042s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.006s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 1.006s. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
c @ notificationsAlerts.ts:42
console.ts:137 [Extension Host] Error generating autocompletion: Error: HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions
{"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 526ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
at customFetch (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104571:19)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at withExponentialBackoff (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104258:26)
at _Groq._streamChat (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483821:26)
at _Groq._streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:483778:26)
at _Groq.streamComplete (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:104668:26)
at ListenableGenerator._start (c:\Users\AlexJ\.vscode\extensions\continue.continue-0.9.211-win32-x64\out\extension.js:96495:28)
C @ console.ts:137
notificationsAlerts.ts:42 HTTP 429 Too Many Requests from https://api.groq.com/openai/v1/chat/completions {"error":{"message":"Rate limit reached for model `llama-3.1-8b-instant` in organization `org_01ht7j91w5fjx8yabp2xh9hq37` on requests per minute (RPM): Limit 30, Used 30, Requested 1. Please try again in 526ms. Visit https://console.groq.com/docs/rate-limits for more information.","type":"requests","code":"rate_limit_exceeded"}}
Facing the same issue. Has this been resolved?
Facing the same issue. Has this been resolved?
Nope. This is the first response that I have gotten. As for me, I just shut off tab completion so I could continue working.
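For anyone who wants to do the same, tab completion can be throttled or turned off from config.json via the `tabAutocompleteOptions` block. The field names below (`disable`, `debounceDelay`) are my best reading of the docs and may differ between versions, so double-check against the Continue documentation for your release:

```json
"tabAutocompleteOptions": {
  "disable": true,
  "debounceDelay": 1000
}
```

Setting `disable` back to false but keeping a longer `debounceDelay` may also help stay under Groq's 30 requests-per-minute limit.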
bump
Facing the same issue. Has this been resolved?
@HusainBhattiwala The closest thing to an answer that I have at the moment is located in this thread: https://github.com/continuedev/continue/issues/2457#issuecomment-2402693231
Not that this helps much, as I have used Anthropic, Gemini, and Groq with similar issues. That's not to mention the major question: why is autocomplete sending so many requests that it rate-limits an API which I have historically used in the same fashion? I imagine that we are not the only ones experiencing this issue.
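Whatever the root cause, the pattern in the log suggests requests aren't being spaced out at all. A minimal sketch of the usual client-side guard, enforcing a minimum interval between requests (Groq's 30 RPM limit allows one every 2 seconds), purely illustrative and not Continue's actual code:

```python
import time


class RateLimitedClient:
    """Sketch: allow a request only if min_interval seconds have passed."""

    def __init__(self, min_interval=2.0):
        self.min_interval = min_interval
        self.last_sent = float("-inf")  # no request sent yet

    def can_send(self, now=None):
        """Return True and record the send time if enough time has elapsed."""
        now = time.monotonic() if now is None else now
        if now - self.last_sent < self.min_interval:
            return False  # too soon: drop or defer this request
        self.last_sent = now
        return True


client = RateLimitedClient(min_interval=2.0)
# Simulated request timestamps in seconds:
results = [client.can_send(now=t) for t in [0.1, 0.5, 1.0, 2.5, 3.0, 5.0]]
print(results)  # → [True, False, False, True, False, True]
```

A debounce on keystrokes would achieve something similar from the editor side: only the last event in a burst actually triggers a completion request.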
Amplifying. I don't have this issue yet, but reading #2457 and #2343, I think commenters mistakenly attribute this (#2343) to a problem with the rate-limit, while the reporter makes a good case it's Continue stuck in a loop:
it appears that Continue sent* ... the contents of the TXT file ... over and over again until Groq's request limit prevented it from sending anymore. I received 5 of the HTTP 429 error messages ... all at once, so my guess is that it was sending the requests back to back without stopping.
There is a suggestion on the other thread that upgrading solves #2457. Maybe this too?
Thank you for commenting! I'll look into the potential remedy.
To be honest, I've been working on some other things and haven't earnestly attempted to find a solution.
I received this error upon sending a message in chat, without receiving a response. I reproduced it three times. The message and context were small. Disabling autocomplete (as mentioned) allowed my next attempt to succeed.
Update: Some requests later in the same chat (autocomplete still disabled), the error occurred again consistently. Restarting VS Code did not help. Starting a new chat seemed to resolve the error, but then I got the error after a response instead of before it.
Is this error making me lose credits faster?
I've been watching my credit usage in Anthropic since before the errors and after. It jumped up by $1 of usage after only a few messages (with errors). Previous interaction (of a similar scope) was taking only small fractions of cents at a time.
Update: This may be because I had one chat that became very long, and I read (somewhere) that with each message in a long chat, Claude re-reads the entire chat.
I'm sorry to hear that you are having the same issue as I am. Perhaps you should open another issue thread because I think that at this point this one is going to remain open. We all may have better luck if there are multiple threads active. Just mention this one in that thread and it will link our threads together.
To answer your last question specifically... Yes, this will increase your usage. If you are using a paid service I would suggest, like you have been, keeping a close watch because it will likely send things multiple times. If the context is long....well... then it will be a lot.
Since this has gone unresolved, I have become very proficient with Aider and Cline (formerly ClaudeDev); that has become my "AI tool stack" of sorts. For refactoring, feature additions, and other edit-focused tasks I use Aider from the CLI, and to get an idea out of my head, planned out, and to at least an MVP, I have been using Cline. I will be lost if they ever paywall those tools.
Hope this helps. Cheers.
This issue hasn't been updated in 90 days and will be closed after an additional 10 days without activity. If it's still important, please leave a comment and share any new information that would help us address the issue.
This issue was closed because it wasn't updated for 10 days after being marked stale. If it's still important, please reopen + comment and we'll gladly take another look!
Is anyone taking a look at this?