> Sorry, your request was rate-limited. Please wait and try sending again later.
I've already read that a new rate limit was implemented and that a fix went out, but after about 75 requests in roughly 2 hours and 15 minutes I just hit the rate limit again.
Copilot
- Version: 1.232.0
- Build: prod
- Editor: vscode/1.94.0-insider
Environment
- http_proxy: n/a
- https_proxy: n/a
- no_proxy: n/a
- SSL_CERT_FILE: n/a
- SSL_CERT_DIR: n/a
- OPENSSL_CONF: n/a
Feature Flags
- Send Restricted Telemetry: disabled
- Chat: enabled
- Content exclusion: unavailable
Node setup
- Number of root certificates: 147
- Operating system: Linux
- Operating system version: 5.15.153.1-microsoft-standard-WSL2
- Operating system architecture: x64
- NODE_OPTIONS: n/a
- NODE_EXTRA_CA_CERTS: n/a
- NODE_TLS_REJECT_UNAUTHORIZED: n/a
- tls default min version: TLSv1.2
- tls default max version: TLSv1.3
Network Configuration
- Proxy host: n/a
- Proxy port: n/a
- Kerberos SPN: n/a
- Reject unauthorized: disabled
- Fetcher: HelixFetcher
Reachability
- github.com: HTTP 200
- api.github.com: HTTP 200
- copilot-proxy.githubusercontent.com: HTTP 200
- api.githubcopilot.com: HTTP 200
- default.exp-tas.com: HTTP 200
VS Code Configuration
- HTTP proxy:
- HTTP proxy authentication: n/a
- Proxy Strict SSL: true
- Extension HTTP proxy support: override
Extensions
- Is `win-ca` installed?: false
- Is `mac-ca` installed?: false
Authentication
- GitHub username: ########
It is working now, but who knows for how long.
The problem is that when a rate limit is hit, no time is specified for when it might expire. There are also no published rate limits anywhere, and judging by the figures I did manage to find, I should be nowhere near them. Not happy with all these changes.
Same issue
This rate limit is really bad. I use this in place of having 2 or 3 more devs on my team; together we can do about that much work with it. Now we're deadlocked. It's not what I paid for.
I'm of the same opinion. I just hit the limit again, and I don't think I used it that much. Not happy with the way this has been implemented.
@zbayle You're sitting in the top 100 Copilot users based on our rate-limit dashboard. During the period you got rate limited, you made a request every minute, non-stop, for multiple hours.
@AdrianIAna The same applies to you.
While I understand the frustration that comes with getting rate limited, these limits are in place to protect the overall GPU clusters and ensure sufficient capacity for all Copilot users. Only a small handful of users receive rate limits (we're talking less than 0.01%), and unfortunately that includes you.
Some updates which may help:
- Newer versions of the client will now tell you how long to wait before trying a request again, so you can see how long you've been rate limited for.
- The rate limit is based on tokens, not requests, so smaller prompts and conversations with less history allow for more requests than those that send over large files. GPU usage is directly related to the number of tokens in a prompt and its response (see the illustrative sketch below).
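To illustrate how a token-based limit differs from a request-based one, here is a minimal token-bucket sketch. The class name, capacity, and refill rate are invented for illustration; GitHub has not published its actual limits or implementation.

```typescript
// Hypothetical sketch of a token-based limiter; not GitHub's actual
// implementation. Capacity and refill rate are invented numbers.
class TokenBudgetLimiter {
  private available: number;
  private lastRefill = Date.now();

  constructor(
    private readonly capacity = 100_000,   // max token budget in the bucket
    private readonly refillPerSecond = 50, // tokens restored per second
  ) {
    this.available = capacity;
  }

  /** Returns 0 if the request may proceed, otherwise seconds to wait. */
  tryConsume(promptTokens: number, expectedResponseTokens: number): number {
    // Refill the budget for the time elapsed since the last call.
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.available = Math.min(
      this.capacity,
      this.available + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefill = now;

    // Both the prompt and the expected response count against the budget.
    const cost = promptTokens + expectedResponseTokens;
    if (cost <= this.available) {
      this.available -= cost;
      return 0;
    }
    // Seconds until enough budget refills for *this* request, which is why
    // a smaller follow-up request may be allowed sooner.
    return (cost - this.available) / this.refillPerSecond;
  }
}
```

Note how the returned wait time is computed for the specific request that was refused, which is one reason a smaller follow-up prompt can succeed while re-sending the original one would not.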
I believe it would be helpful if rate limits were adjusted based on usage patterns, particularly during off-peak hours when the clusters are under less strain. This could provide more flexibility without affecting overall system performance.
Additionally, is there a way for users to monitor the number of tokens they are consuming? This would allow us to better manage our usage and avoid hitting the rate limits.
I’ll make an effort to be more selective with my prompts and use smaller ones to stay within the limits.
You mentioned that newer versions of the client would show the remaining wait time during rate limits. However, I'm using version 1.232.0 (the latest), and I haven’t seen that feature yet.
Lastly, for users willing to pay more, have you considered offering a Pro version that provides access to more tokens?
Thanks for the feedback. These are all great points, and I will forward them to the team responsible for maintaining the service that powers Copilot.
As for the version, you will need VS Code Insiders, which will let you install the latest pre-release of Copilot Chat, v0.21.2024092302.
I would also love to understand the usage pattern that is causing you to send so many requests and use such a high amount of tokens.
I keep getting stopped in my tracks by this. I am debugging complex code and using Copilot constantly to help me do it, which is surely what it is for. I am able to formulate prompts, understand and test the response, and be ready to ask the next question, and it appears the service cannot keep up with this pace of work. By the way, I am not complaining; this is an amazing thing, which is why I am using it so much!
The big question is, how can you be charging real money for something, but throttle access to it? The suggestion of a pro level is a good one. I'd pay more for that.
I'm in the same boat as CountBorgula. I sometimes use it as a conversational chat, pasting an error to find a solution, in this case for a C program...
> @zbayle You're sitting in the top 100 Copilot users based on our rate-limit dashboard. During the period you got rate limited, you made a request every minute, non-stop, for multiple hours.
>
> @AdrianIAna The same applies to you.
>
> While I understand the frustration that comes with getting rate limited, these limits are in place to protect the overall GPU clusters and ensure sufficient capacity for all Copilot users. Only a small handful of users receive rate limits (we're talking less than 0.01%), and unfortunately that includes you.
>
> Some updates which may help:
>
> - Newer versions of the client will now tell you how long to wait before trying a request again, so you can see how long you've been rate limited for.
> - The rate limit is based on tokens, not requests, so smaller prompts and conversations with less history allow for more requests than those that send over large files. GPU usage is directly related to the number of tokens in a prompt and its response.
I get that, and it's interesting to know. I would have expected there to be way more heavy users, or maybe I'm leaning too much on the AI. A simple solution would be to add tokens used / tokens total at the bottom of each reply. That would be enough for most people to track and improve their usage.
Maybe it means telling your boss GitHub says you're done for the day, and bye.
I ran into the limit while definitely relying on the AI. It also craps out a ton of repetitive information, so I'm going to try making my prompts ask for less verbosity and repetition: smaller chunks at a time.
I just started seeing this message after updating to the latest version of GC. I had never seen it before, and I don't think I use it enough to be rate limited, but now I'm seeing it. I'm not using it more than I have in the past. The worst part is that it says to wait xx seconds before trying again, but that doesn't work: when I wait the stated amount of time, it gives me another message saying to wait more. What gives?
@r0bdiabl0 I'm not seeing anything about you getting rate limited in our dashboard. Can you please share your Copilot Chat logs? They can be grabbed via CMD/CTRL + SHIFT + U -> GitHub Copilot Chat.
```
2024-10-03 13:46:21.128 [info] Using the Electron fetcher.
2024-10-03 13:46:21.128 [info] Initializing Git extension service.
2024-10-03 13:46:21.128 [info] Successfully activated the vscode.git extension.
2024-10-03 13:46:21.128 [info] Enablement state of the vscode.git extension: true.
2024-10-03 13:46:21.128 [info] Successfully registered Git commit message provider.
2024-10-03 13:46:21.265 [info] Logged in as r0bdiabl0
2024-10-03 13:46:21.937 [info] Got Copilot token for r0bdiabl0
2024-10-03 13:46:22.070 [info] copilot token chat_enabled: true, sku: monthly_subscriber
2024-10-03 13:46:22.075 [info] Registering default platform agent...
2024-10-03 13:46:22.076 [info] Successfully activated the GitHub.vscode-pull-request-github extension.
2024-10-03 13:46:22.076 [info] [githubTitleAndDescriptionProvider] Initializing GitHub PR title and description provider provider.
2024-10-03 13:46:22.076 [info] Successfully registered GitHub PR title and description provider.
2024-10-03 13:46:22.076 [info] Successfully registered GitHub PR reviewer comments provider.
2024-10-03 13:46:22.350 [info] Fetched model metadata in 402ms 65ef2d18-1d3f-467e-b804-fa459dbc09c1
2024-10-03 13:46:22.830 [info] activationBlocker from 'languageModelAccess' took for 2010ms
2024-10-03 13:49:49.166 [info] message 0 returned. finish reason: [stop]
2024-10-03 13:49:49.168 [info] request done: requestId: [489bec9e-2b54-4f90-848f-7fba25dec6fa] model deployment ID: []
2024-10-03 13:49:53.486 [info] request done: requestId: [e37fc40b-2f8f-4ea2-9cdb-ce009bc50226] model deployment ID: []
2024-10-03 13:50:07.846 [info] message 0 returned. finish reason: [stop]
2024-10-03 13:50:07.848 [info] request done: requestId: [259b6c04-172e-4576-8ebe-e4a5e0d1e4d5] model deployment ID: []
2024-10-03 13:50:20.940 [info] message 0 returned. finish reason: [stop]
2024-10-03 13:50:20.941 [info] request done: requestId: [d9830bfc-c8d8-4e17-a84b-f612fd7a585b] model deployment ID: []
2024-10-03 13:50:21.938 [info] Request ID for failed request: 72abebdd-4f13-4073-b922-a1db51ccf33e
2024-10-03 13:50:21.939 [error] Failed to fetch followups because of response type (rateLimited) and reason (rate limit exceeded )
2024-10-03 13:59:43.720 [info] Fetched model metadata in 449ms 8d51ea20-8b6d-4b08-9e5e-1e54b7632399
2024-10-03 13:59:46.784 [info] message 0 returned. finish reason: [stop]
2024-10-03 13:59:46.784 [info] request done: requestId: [98d13518-3653-45c8-a4c2-e5115e4a33f6] model deployment ID: []
2024-10-03 14:00:05.996 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:00:05.997 [info] request done: requestId: [41cbf977-2659-4d1e-b06d-65c56a9afc92] model deployment ID: []
2024-10-03 14:00:07.868 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:00:07.869 [info] request done: requestId: [c59b1045-ae4b-4fa3-87bc-31f812da3e0f] model deployment ID: []
2024-10-03 14:05:15.942 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:05:15.943 [info] request done: requestId: [4d3a9580-979f-43fe-b0d9-658fa8275868] model deployment ID: []
2024-10-03 14:05:32.645 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:05:32.646 [info] request done: requestId: [f8d3cb28-a527-4782-add4-9d1dd7a6e7e7] model deployment ID: []
2024-10-03 14:05:33.668 [info] Request ID for failed request: 83a557eb-224d-4a5d-bb7a-0c2e382b8077
2024-10-03 14:05:33.669 [error] Failed to fetch followups because of response type (rateLimited) and reason (rate limit exceeded )
2024-10-03 14:07:07.249 [info] Using the Electron fetcher.
2024-10-03 14:07:07.249 [info] Initializing Git extension service.
2024-10-03 14:07:07.855 [info] Successfully activated the vscode.git extension.
2024-10-03 14:07:07.855 [info] Enablement state of the vscode.git extension: true.
2024-10-03 14:07:07.855 [info] Successfully registered Git commit message provider.
2024-10-03 14:07:08.385 [info] Logged in as r0bdiabl0
2024-10-03 14:07:11.753 [info] Got Copilot token for r0bdiabl0
2024-10-03 14:07:12.509 [info] Fetched model metadata in 733ms 0300dd93-e054-4278-8733-b9eedc36cc36
2024-10-03 14:07:13.117 [info] activationBlocker from 'languageModelAccess' took for 5846ms
2024-10-03 14:07:13.268 [info] copilot token chat_enabled: true, sku: monthly_subscriber
2024-10-03 14:07:13.277 [info] Registering default platform agent...
2024-10-03 14:07:13.278 [info] Successfully activated the GitHub.vscode-pull-request-github extension.
2024-10-03 14:07:13.278 [info] [githubTitleAndDescriptionProvider] Initializing GitHub PR title and description provider provider.
2024-10-03 14:07:13.278 [info] Successfully registered GitHub PR title and description provider.
2024-10-03 14:07:13.278 [info] Successfully registered GitHub PR reviewer comments provider.
2024-10-03 14:10:27.162 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:10:27.163 [info] request done: requestId: [8d63d583-b43f-49f3-8bed-5b249f84f52d] model deployment ID: []
2024-10-03 14:10:27.928 [info] Request ID for failed request: 38016c98-e462-4693-b4d2-bf2af12f21c8
2024-10-03 14:10:50.996 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:10:50.997 [info] request done: requestId: [2a6ba5c5-10f8-4d06-b3e8-5b3b56393810] model deployment ID: []
2024-10-03 14:10:51.571 [info] Request ID for failed request: 61eaf802-e6b8-4fb2-a15e-34e94588d0eb
2024-10-03 14:11:36.888 [info] Request ID for failed request: dea02367-33fe-40d2-b73f-ded884cc80a4
2024-10-03 14:11:37.412 [info] Request ID for failed request: 486c42a9-b130-43ba-8ac0-ca1bcb678e7e
2024-10-03 14:14:39.402 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:14:39.404 [info] request done: requestId: [492f3821-559a-45b9-9666-d2c469157ea5] model deployment ID: []
2024-10-03 14:14:39.860 [info] Request ID for failed request: 9f52d7d8-56dd-48f0-9de9-c3deaeeae1f9
2024-10-03 14:18:23.727 [info] Fetched model metadata in 498ms dbe62d6a-4be5-44a4-a34f-0c748093b1f2
2024-10-03 14:18:24.915 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:18:24.915 [info] request done: requestId: [95a14507-e6b9-4754-ada3-a0c6b386c398] model deployment ID: []
2024-10-03 14:18:35.872 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:18:35.873 [info] request done: requestId: [f5573e4e-838b-4a0e-9931-f4240d305497] model deployment ID: []
2024-10-03 14:18:36.309 [info] Request ID for failed request: bcb76068-62a3-44e7-8af6-5e2ff7bd1029
2024-10-03 14:18:36.310 [error] Failed to fetch followups because of response type (rateLimited) and reason (rate limit exceeded )
2024-10-03 14:19:38.716 [info] Request ID for failed request: f555ad79-3777-4ac9-8213-25f8d916ce10
2024-10-03 14:19:39.223 [info] Request ID for failed request: 99270a43-dad0-4a2f-91a7-849455cd1307
2024-10-03 14:20:15.078 [info] Request ID for failed request: f1d95614-ec24-4ad0-876d-7ae1e0c0178e
2024-10-03 14:20:15.614 [info] Request ID for failed request: 449984b1-67f7-423c-ab17-065567844f89
2024-10-03 14:27:35.509 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:27:35.510 [info] request done: requestId: [d67c69b0-3863-403a-b139-5c70dabaca17] model deployment ID: []
2024-10-03 14:27:47.409 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:27:47.409 [info] request done: requestId: [c6697c23-a3ac-47f6-87e5-ad7bd60b890f] model deployment ID: []
2024-10-03 14:27:49.132 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:27:49.133 [info] request done: requestId: [708c4dd9-254d-45a9-a677-dbdcef859b75] model deployment ID: []
2024-10-03 14:32:11.728 [info] Logged in as r0bdiabl0
2024-10-03 14:32:12.342 [info] Got Copilot token for r0bdiabl0
2024-10-03 14:32:12.353 [info] copilot token chat_enabled: true, sku: monthly_subscriber
2024-10-03 14:32:12.779 [info] Fetched model metadata in 423ms 5d3fecbf-1cda-435d-a7c9-2534b2516055
2024-10-03 14:45:11.214 [info] Fetched model metadata in 440ms 9ce1b1aa-685d-47c1-8c23-3906d7fa331f
2024-10-03 14:45:12.393 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:45:12.393 [info] request done: requestId: [b21b83a4-ab2c-4ca5-a7ed-b09fc1034d85] model deployment ID: []
2024-10-03 14:45:21.177 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:45:21.178 [info] request done: requestId: [e17ef495-e370-45ab-b114-e9fbd0ab721b] model deployment ID: []
2024-10-03 14:45:22.347 [info] message 0 returned. finish reason: [stop]
2024-10-03 14:45:22.348 [info] request done: requestId: [17ac60c0-a87d-4e3d-a7d1-855acfc074e4] model deployment ID: []
2024-10-03 14:57:12.388 [info] Logged in as r0bdiabl0
2024-10-03 14:57:13.357 [info] Got Copilot token for r0bdiabl0
2024-10-03 14:57:13.367 [info] copilot token chat_enabled: true, sku: monthly_subscriber
2024-10-03 14:57:13.682 [info] Fetched model metadata in 310ms 6c65302b-606d-4ff7-bd65-50ce11d1ae0e
2024-10-03 15:07:14.530 [info] Fetched model metadata in 475ms b4fa21ef-d1f5-46e2-bd7e-a3f9f4bf5a28
```
I am getting rate limited every 1-3ish messages. The files we are working on are less than 200 lines, and VS Code shows me as logged in.
This is just stupid.
DaveGER: the graph is too large, make sure its the same size as in the filteredorders.html
GitHub Copilot: Sorry, your request was rate-limited. Please wait 20 seconds before trying again.
DaveGER: just provide the code that needs to be changed
GitHub Copilot: Sorry, your request was rate-limited. Please wait 34 seconds before trying again.
DaveGER: hi
GitHub Copilot: Sorry, your request was rate-limited. Please wait 23 seconds before trying again.
DaveGER: the graph is too large, make sure its the same size as in the filteredorders.html
GitHub Copilot: Sorry, your request was rate-limited. Please wait 77 seconds before trying again.
Please help!
Same issue here. I've got a paid account and am all of a sudden hitting rate limits. I cannot work like this. Please fix ASAP. Thank you :-)
I am using VS Code in version:
GitHub Copilot in version:
Still a 5-star rating 👍 🥇 Nice tool. Love it. It could be more creative with code analysis, code suggestions, and refactoring.
Thank you in advance for your help. 💯
Please give VS Code Insiders + the pre-release a try. We've done some token optimizations there that should make it less likely for you to hit the rate limit.
That doesn't really seem to solve the problem. Is the rate-limit IP-based? I'm using Cloudflare WARP, so that could explain why the rate-limit countdown seems to fluctuate at random. If not, then there's something very wrong with your rate-limit algorithm.
The entire point is to have back-and-forth code-correction and improvement dialogues with the AI to increase productivity and write more performant code. I always send the full module context to the AI, because it needs to see the full picture to give accurate answers. I assume it's also sending prior conversation history (including its full-reprint of my code with each response), thus maxing out the tokens with each exchange.
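For a rough sense of how quickly full-context, history-carrying turns add up, consider this back-of-the-envelope sketch. All per-turn token counts are invented for illustration, not measured from Copilot:

```typescript
// Rough illustration of how re-sent history compounds token usage.
// All sizes are assumptions: a 2,000-token module, 200-token questions,
// and 1,500-token replies that re-print the code.
const moduleTokens = 2_000;
const questionTokens = 200;
const replyTokens = 1_500;

let historyTokens = 0;
let cumulative = 0;
for (let turn = 1; turn <= 10; turn++) {
  // Each turn re-sends the module, the new question, and all prior history.
  const promptTokens = moduleTokens + questionTokens + historyTokens;
  cumulative += promptTokens + replyTokens;
  historyTokens += questionTokens + replyTokens; // history grows every turn
  console.log(`turn ${turn}: prompt=${promptTokens}, cumulative=${cumulative}`);
}
// By turn 10 the prompt alone is 17,500 tokens: growth is quadratic
// in the number of turns, not linear.
```

Under these assumptions the prompt alone reaches 17,500 tokens by the tenth turn, which is consistent with the point above about history maxing out the tokens on each exchange.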
I feel like this shouldn't be an issue for users paying $10 a month, seeing how Copilot Chat is clearly using GPT-4o-mini, not the much more expensive 4o counterpart. Otherwise, what's the incentive to continue paying for this service instead of just using the free version of ChatGPT?
The rate limit is tied to your account, not your IP, and it's based on the number of tokens you use, which is a good measure of AI cost. The users receiving rate limits are in the top 0.01% of Copilot users, but we understand that getting rate limited is frustrating and are working to improve our limits and our code.
> I feel like this shouldn't be an issue for users paying $10 a month, seeing how Copilot Chat is clearly using GPT-4o-mini

This is just false. We use GPT-4o.
> The rate limit is tied to your account, not your IP, and it's based on the number of tokens you use, which is a good measure of AI cost. The users receiving rate limits are in the top 0.01% of Copilot users, but we understand that getting rate limited is frustrating and are working to improve our limits and our code.
Then either your metrics are unrealistic, or, as I suggested earlier, there's a bug in your current rate-limit algorithm. You're suggesting that 99.9% of developers using Copilot send only one or two dozen full-context requests to the AI per hour during a coding session. If that's the expected use case, I must once again ask what the incentive is for users to pay for this service. That's not much higher than ChatGPT's 4o limit for free users.
> @zbayle You're sitting in the top 100 Copilot users based on our rate-limit dashboard. During the period you got rate limited, you made a request every minute, non-stop, for multiple hours.
>
> @AdrianIAna The same applies to you.
Your earlier dialogue further leads me to believe that the problem is with your rate limit algorithm or the extension itself. I can assure you that my usage did not come close to the rate suggested by your metrics, unless the extension is making requests in the background without my knowledge. I'm always recompiling and testing my code after each change, which takes at least a minute or two. To exceed the rates you gave, I'd have to be sitting there and spamming the extension with non-stop requests.
Additionally, there is still no explanation as to why the rate limit cooldown doesn't align with actual elapsed time (and sometimes even increases) upon retrying.
> > I feel like this shouldn't be an issue for users paying $10 a month, seeing how Copilot Chat is clearly using GPT-4o-mini
>
> This is just false. We use GPT-4o.

The context limit and generation speed led me to believe otherwise. Does GitHub Copilot not use the OpenAI API?
I know it is frustrating, and as I said, we are working to improve things in this area. Please give the Copilot pre-release and VS Code Insiders a try, as we've done some work to optimize token usage there. I assure you that I've analyzed every rate-limit report on this repository by hand and have yet to find any anomalies that would indicate the logic is working improperly. The rate limits are not public, and therefore I am not able to discuss them.
> Additionally, there is still no explanation as to why the rate limit cooldown doesn't align with actual elapsed time (and sometimes even increases) upon retrying.

I can provide this. The number of seconds shown is the number of seconds needed to complete the exact request that was just rate limited. However, we make multiple requests per turn of the conversation, so this isn't 100% accurate in terms of the number of seconds you need to wait. If you change your request in any way (e.g. a different file), it is also not accurate. This is a known bug.
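As a client-side illustration of why the advertised wait is only a hint, a defensive retry loop can re-read the server's value on every attempt instead of trusting it once. This is a hypothetical sketch: `sendRequest`, the response shape, and the fallback backoff are assumptions, not Copilot's actual client code.

```typescript
// Hypothetical retry loop. Because several requests can fire per
// conversation turn, the advertised wait can change between attempts,
// so it is re-read on every retry instead of trusted once.
async function sendWithRetry(
  sendRequest: () => Promise<{ status: number; retryAfterSeconds?: number }>,
  maxAttempts = 5,
): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await sendRequest();
    if (res.status !== 429) return; // not rate-limited: done
    // Prefer the server's hint; fall back to exponential backoff.
    const waitSeconds = res.retryAfterSeconds ?? 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, waitSeconds * 1000));
  }
  throw new Error("still rate-limited after retries");
}
```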
> The context limit and generation speed led me to believe otherwise. Does GitHub Copilot not use the OpenAI API?

Copilot is hosted on dedicated GPU capacity within Azure and is powered by Azure OpenAI through a partnership with OpenAI. This may account for the speed differences you notice.
> I know it is frustrating, and as I said, we are working to improve things in this area. Please give the Copilot pre-release and VS Code Insiders a try, as we've done some work to optimize token usage there. I assure you that I've analyzed every rate-limit report on this repository by hand and have yet to find any anomalies that would indicate the logic is working improperly. The rate limits are not public, and therefore I am not able to discuss them.
I hit the rate limit again shortly after switching to VS Code Insiders, and I had been using the pre-release before that. After taking a break, I've yet to hit another rate-limit error, but I also haven't been working uninterrupted, so I can't confirm whether the issue has been mitigated significantly. I'll put this issue to rest until/unless it happens again.
> > Additionally, there is still no explanation as to why the rate limit cooldown doesn't align with actual elapsed time (and sometimes even increases) upon retrying.
>
> I can provide this. The number of seconds shown is the number of seconds needed to complete the exact request that was just rate limited. However, we make multiple requests per turn of the conversation, so this isn't 100% accurate in terms of the number of seconds you need to wait. If you change your request in any way (e.g. a different file), it is also not accurate. This is a known bug.
>
> > The context limit and generation speed led me to believe otherwise. Does GitHub Copilot not use the OpenAI API?
>
> Copilot is hosted on dedicated GPU capacity within Azure and is powered by Azure OpenAI through a partnership with OpenAI. This may account for the speed differences you notice.
Thank you for the info, that explains a lot of my misgivings. I apologize for getting heated and making assumptions out of ignorance, and I appreciate your timely answers.
> Sorry, your request was rate-limited. Please wait 29 minutes before trying again.

Using the Copilot pre-release and VS Code Insiders. Limit encountered after barely half a workday of normal use.
I'm also hitting rate limits with VS Code Insiders. I first hit the limits on GA, after which I installed Insiders. With Insiders it worked for a little while before hitting the rate limit again. I fully understand the explanations given above, but I think a lot of the frustration comes from the "please wait xx seconds" message not being very accurate. It seems to count down (though not precisely), but once it is supposed to hit zero, it just adds more time to the clock.
Additionally, it would be good to understand how the number of tokens spent / calls made can be reduced so that you don't hit your max too often. E.g. the "Apply in Editor" function seems to use AI as well, whereas a copy/paste probably doesn't use any capacity.
At this point I'm almost ok with you adding "Watch this 30 seconds ad to continue using Copilot" so I can keep moving 😆
> The big question is, how can you be charging real money for something, but throttle access to it? The suggestion of a pro level is a good one. I'd pay more for that.

Paying doesn't help. We have a GitHub Team org with Copilot Business and face the same issues. The worst part is that Copilot spews off at the head, doesn't answer the question asked; you must then ask a different way, it spews again, and the entire content counts "against" you.
> > Sorry, your request was rate-limited. Please wait 29 minutes before trying again.
>
> Using the Copilot pre-release and VS Code Insiders. Limit encountered after barely half a workday of normal use.
Be glad you made it that long. We're paying through the nose and didn't make it but a couple of hours!
> > The big question is, how can you be charging real money for something, but throttle access to it? The suggestion of a pro level is a good one. I'd pay more for that.
>
> Paying doesn't help. We have a GitHub Team org with Copilot Business and face the same issues. The worst part is that Copilot spews off at the head, doesn't answer the question asked; you must then ask a different way, it spews again, and the entire content counts "against" you.
I think that is one of the main token sinks. For a common example, if I show the AI my full class/module to give it the context it needs to debug, it'll oftentimes spit back the entire thing verbatim and then gaslight me into thinking it actually changed something, beyond removing a few comments at best (thank goodness for git diffs). So oftentimes I'll end up both sending and receiving thousands of tokens multiple times in short succession while rewording the question to get it to finally do what I asked. Also, even if you cancel it mid-generation, it'll still continue generating in the background and mark those tokens against you.
I've also tried giving it custom instructions through a copilot-instructions.md file to sway it into only responding with relevant code snippets, but I've found that it'll hard-refuse your requests half the time when the client attaches that file to your prompts.
> Also, even if you cancel it mid-generation, it'll still continue generating in the background and mark those tokens against you.

This should not be the case.
We've also gone ahead and upped the rate limit significantly over the past week and will continue to monitor this issue.