vscode-copilot-release

Agent mode hitting rate limit, with Copilot Pro subscription

Open wowtah opened this issue 8 months ago • 65 comments

Version: 1.99.0 (system setup)
Commit: 4437686ffebaf200fa4a6e6e67f735f3edf24ada
Date: 2025-04-02T21:35:19.530Z
Electron: 34.3.2
ElectronBuildId: 11161073
Chromium: 132.0.6834.210
Node.js: 20.18.3
V8: 13.2.152.41-electron.0
OS: Windows_NT x64 10.0.22631

GitHub Copilot Version 1.296.0
GitHub Copilot Chat Version 0.26.0

According to the GitHub Copilot plan specs, Agent mode should have no usage cap in a Pro subscription.

However, after some usage, I get:

Sorry, you have exhausted the agent mode usage limit. Please switch to ask mode and try again later

My subscription: Image

The Pro plan spec: Image Link to plan specification

My expectation is that the Pro plan has no rate limiting on Agent mode.

Also, the documentation keeps mentioning 'Base model'. But what is the base model? How do I select it? My only options in the models menu are:

  • GPT-4o
  • Claude 3.5 Sonnet
  • Claude 3.7 Sonnet

wowtah Apr 07 '25 12:04

I upgraded to Pro+ thinking I'd misunderstood what tier got "Unlimited Agent Mode and Chats with Base Model"... I feel like I got robbed a bit?

chadmoore Apr 07 '25 17:04

There are some secret limits on ALL subscription plans (even Enterprise). It is mentioned in discussion: https://github.com/orgs/community/discussions/148896

So unlimited has to mean something else than without the limits, I suppose?

macie Apr 07 '25 19:04

Just to provide some clarity, we're investigating this but believe these are just normal rate limits which just means you've exceeded limits we have set for fair use and they will reset in a few hours. The unlimited being referenced in the marketing material just means you can do as many requests and while you may be throttled you'll never be charged. We understand the confusion and are working to clarify the wording here.

lramos15 Apr 07 '25 20:04

I understand that there is a limit to make it "fair" but we actually pay a subscription for this "unlimited" term because some users need to use it all day long whether for development or not.

KaynoxDev Apr 07 '25 20:04

> Just to provide some clarity, we're investigating this but believe these are just normal rate limits which just means you've exceeded limits we have set for fair use and they will reset in a few hours. The unlimited being referenced in the marketing material just means you can do as many requests and while you may be throttled you'll never be charged. We understand the confusion and are working to clarify the wording here.

First of all: thanks for investigating! I really hope the rate limit I am hitting is not the intended one. Because... I am not doing anything excessive/crazy I think.

I am just in my normal dev routine and have switched from Copilot Edit mode to Agent mode. I ask the Agent to do things for me in my project, and it does them pretty well. But after around an hour of continuous use I run into the rate limiter, and then it feels like I have to take a break for another hour, to avoid hitting the limiter too soon again. This really breaks my dev-flow. And the rate limiter never kicks in at a convenient moment ofc (I have had it kick in in the middle of quite large refactoring operations, and sometimes it wouldn't pick up properly where it stopped).

To summarize: my expectation is that a (paid) tool designed to work a certain way (Copilot Agent has full control over how many requests it makes, how efficient these requests are, etc.) should never run into a rate limiter in an average/intended user scenario. Especially when the paid plan specification states: unlimited ;)

wowtah Apr 07 '25 21:04

I just started using agent mode today, and after just a few requests I got the "limit reached" message. I hope this is just some kind of bug; I have the Pro subscription.

rubmode Apr 07 '25 22:04

I just updated the plugin, and all the models I had before disappeared, limiting me to 3 models. Agent mode is not even available.

dinchu Apr 07 '25 23:04

False Advertisement

Everyone should realize what this is

laz-001 Apr 08 '25 01:04

A 'hate-inducing' bug.

> The unlimited being referenced in the marketing material just means you can do as many requests and while you may be throttled you'll never be charged.

@lramos15 , beyond the technical detail (to not show the user when a limit will kick in, thus it doesn't appear mid-work), there is a legal matter, too.

=> False Advertisement

=> Illegal

This is the kind of bug that makes you want nothing more to do with a company.

A 'hate-inducing' bug.

laz-001 Apr 08 '25 01:04

Up, I've got this problem too.

AlexandreServignat Apr 08 '25 08:04

Me too, back to Cursor.

hcentelles Apr 08 '25 12:04

It seems that no clear info about rate limiting is a known issue since January 2024 (I believe microsoft/vscode-copilot-release#732 is the first one and microsoft/vscode#253124 is the leading one).

@lramos15 Are there any plans to tackle the problem in the near future?

macie Apr 08 '25 16:04

> Just to provide some clarity, we're investigating this but believe these are just normal rate limits which just means you've exceeded limits we have set for fair use and they will reset in a few hours. The unlimited being referenced in the marketing material just means you can do as many requests and while you may be throttled you'll never be charged. We understand the confusion and are working to clarify the wording here.

@lramos15 In addition to clarifying the wording can you also publish what the fair use limits are? I've been making use of Agent mode to do some refactoring (which it's doing a great job at) but I've hit the invisible rate limits with 45 of 242 files processed. I don't know when I can try again.

I'd be perfectly happy if the Agent mode carried on in the background at a slower pace so that it kept itself within the rate limits. Overnight processing of tasks like this would be amazing. Maybe even relax the fair use policies off-peak (if there are any off-peak hours...!)
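The idea of carrying on at a slower pace so that the work stays inside a limit can be sketched as a client-side pacer. This is only a minimal sketch and not how Copilot's agent works: the class name `SlidingWindowPacer` is hypothetical, and any numbers passed to it are placeholders, since the actual fair use limits are not published in this thread.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowPacer:
    """Allow at most `max_requests` request starts within any `window_s` seconds."""

    def __init__(
        self,
        max_requests: int,
        window_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_requests < 1 or window_s <= 0:
            raise ValueError("max_requests must be >= 1 and window_s must be > 0")
        self.max_requests = max_requests
        self.window_s = window_s
        self._clock = clock      # injectable so the pacer can be tested without waiting
        self._sleep = sleep
        self._starts: Deque[float] = deque()  # start times of recent requests

    def wait_for_slot(self) -> float:
        """Block until a request may start; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget requests that have already left the window.
            while self._starts and now - self._starts[0] >= self.window_s:
                self._starts.popleft()
            if len(self._starts) < self.max_requests:
                self._starts.append(now)
                return waited
            # Window is full: sleep until the oldest request expires, then re-check.
            delay = self.window_s - (now - self._starts[0])
            self._sleep(delay)
            waited += delay
```

A batch job (for example the 242-file refactoring above) would call `wait_for_slot()` before each request. Something like `SlidingWindowPacer(max_requests=10, window_s=60.0)` is an illustrative configuration only, not a known limit.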

MatthewSteeples Apr 08 '25 23:04

Same here. I hit the message "Sorry, you have exhausted the agent mode usage limit. Please switch to ask mode and try again later." while executing a larger refactoring and cleanup. Now I am stuck in the middle of the work with no clue when I can continue. Very poor experience (otherwise Agent mode works very well).

I was working manually and did not automate anything that would lead to excessive prompts towards the Agent.

  • Advertising it as "Unlimited" is clearly misleading (at least give a specific number like 30 prompts/hour or similar)
  • No warning message before hitting the limit (give us a heads-up instead of the limit appearing out of nowhere)
  • No clear communication about when we can continue working (like "wait for n minutes and try again")
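The last point could look roughly like the sketch below, assuming the service reported how many seconds remain until the limit resets (for example through the standard HTTP `Retry-After` header). Whether Copilot's backend exposes such a value is not stated in this thread, and `retry_message` is a hypothetical helper, not part of any Copilot API.

```python
import math


def retry_message(retry_after_seconds: float) -> str:
    """Turn a reset delay in seconds into a 'try again in n minutes' message."""
    # Round up so the user is never told to retry before the limit has reset.
    minutes = max(1, math.ceil(retry_after_seconds / 60))
    unit = "minute" if minutes == 1 else "minutes"
    return f"Agent mode usage limit reached. Try again in about {minutes} {unit}."
```

Rounding up is a deliberate choice here: telling someone to wait slightly too long is less frustrating than a retry that fails again.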

Pro Subscription active: Image

hbertsch Apr 09 '25 08:04

> Corrected Advertisements Folks, some numbers:
>
> https://github.blog/changelog/2025-04-04-announcing-github-copilot-pro
>
> pro ($10) => 300 requests; new pro+ ($39) => 1500 requests

Not quite, Premium Requests adds to the confusion here though. Premium Requests are requests to any model other than 4o (source), and these limits haven't gone into effect yet (this starts May 5th).

The first time I hit this issue I thought Premium Requests was the explanation too, but it still occurs with 4o.

This (from your link) states premium requests aren't the same as agent mode requests, and agent mode has unlimited requests when using the base model (4o):

> Enjoy all the features you love from GitHub Copilot Pro along with exclusive access to the latest models (GPT-4.5 is available today), priority access to previews, and 1500 premium requests per month when they go live on May 5th. This is in addition to the unlimited requests for agent mode, context-driven chat, and code completions that all paid plans have when using our base model.

steveAG Apr 09 '25 11:04

Thanks for pointing this out, @steveAG. I did not see this before. The usage/payment model is not straightforward (as is often the case with cloud-based services...).

So the important part is here:

  • If you have Copilot Free enabled, your GitHub account comes with up to 2,000 code completions and up to 50 chats or premium requests per month.
  • If you're on a paid plan, you get unlimited code completions, unlimited agent requests, and unlimited chat interactions using the base model. You also receive a monthly allowance of premium requests -> Copilot Free (50) | Copilot Pro (300) | Copilot Pro+ (1500) see this
  • Premium request consumption depends on the model: Image

hbertsch Apr 09 '25 12:04

> Thanks for pointing this out, @steveAG. I did not see this before. The usage/payment model is not straightforward (as is often the case with cloud-based services...).
>
> So the important part is here:
>
>   • If you have Copilot Free enabled, your GitHub account comes with up to 2,000 code completions and up to 50 chats or premium requests per month.
>   • If you're on a paid plan, you get unlimited code completions, unlimited agent requests, and unlimited chat interactions using the base model. You also receive a monthly allowance of premium requests -> Copilot Free (50) | Copilot Pro (300) | Copilot Pro+ (1500) see this
>   • Premium request consumption depends on the model: Image

I got the "Sorry, you have exhausted the agent mode usage limit. Please switch to ask mode and try again later" message using Agent mode with 4o on the Pro plan, so it is not Unlimited; I think it is "Throttled Unlimited". The question is: what's the throttling threshold? More clarity on that would be appreciated.

hcentelles Apr 09 '25 14:04

I can confirm. After running into the rate limiter with Sonnet, when switching to GPT-4o, it will still state:

Sorry, you have exhausted the agent mode usage limit. Please switch to ask mode and try again later

wowtah Apr 09 '25 16:04

Paid plan, two hours into normal usage (first time using agent mode) doing a feature add and got this. This is not abuse and there is no reason to assume I would hit a limit just through normal use, especially on a paid plan. You guys need to fix two things:

  1. When we hit the limit, how long do we need to wait to continue? Am I stuck now for 4 days or 20 minutes?
  2. I imagine you guys guessed the limits of a new tool, but clearly you set them way too low as we are not abusing the system - just trying to do normal work - I'm adding a feature that impacts all of 8 files. It's unfair to advertise it one way, get us all using it in the middle and then crap out on us.

MosheTzvi Apr 09 '25 19:04

> Just to provide some clarity, we're investigating this but believe these are just normal rate limits which just means you've exceeded limits we have set for fair use and they will reset in a few hours. The unlimited being referenced in the marketing material just means you can do as many requests and while you may be throttled you'll never be charged. We understand the confusion and are working to clarify the wording here.

It's been 2 days, when will these fair use limits be published?

edit: Here are the rate limits. https://docs.github.com/en/github-models/prototyping-with-ai-models#rate-limits

Not sure why they are not listed in a more prominent area, but it's all there. Limits per minute, per day, token limits, and concurrent request limits.

GPT-4o is considered a HIGH model according to the marketplace, so

| Rate limits | Copilot Free | Copilot Pro | Copilot Business | Copilot Enterprise |
|---|---|---|---|---|
| Requests per minute | 10 | 10 | 10 | 15 |
| Requests per day | 50 | 50 | 100 | 150 |
| Tokens per request | 8000 in, 4000 out | 8000 in, 4000 out | 8000 in, 4000 out | 16000 in, 8000 out |
| Concurrent requests | 2 | 2 | 2 | 4 |

amerizalde Apr 09 '25 21:04

> Not sure why they are not listed in a more prominent area, but it's all there. Limits per minute, per day, token limits, and concurrent request limits.

Because those are not the right rate limits.

That's the rate limits for GitHub Models which is like a Model Playground / API for users to build their own LLM applications. It's not for Copilot chat.

lramos15 Apr 10 '25 13:04

Are we all going to win a class action lawsuit because Microsoft provides little to no clarity? We paid for a service under the impression that we'd get "unlimited" usage, but Microsoft is very clearly limiting our usage.
But class action lawsuits only hurt Microsoft, provide us all with some chump change, and do not fix the underlying problem: we still don't get unlimited usage. We'll each end up with an extra thirty cents and Microsoft will redefine the meaning of the word "unlimited."


Unrelated yet interesting: I had a VSCode update waiting for me, so after I'd hit my limit I restarted VSCode for the update. Agent mode began working from right where I'd left off.

...Never mind. That lasted for two messages.

imagic5 Apr 12 '25 19:04

My flow is all in vscode. Cursor works with that?

ndsimpson Apr 13 '25 04:04

I have found that occasionally the agent tends to get itself in inefficient loops and starts to make a ton of requests of its own accord, so you hit the rate limit quite quickly.

Frankly, it seems a bit silly that something can generate a poorly architected or inefficient structure, and then just tire itself out trying to iterate on it. I also find you can't really review until it's done with itself anyway, which makes the "stop" pretty pointless unless you know your prompt was wrong, but I have found the agent mode to be far too greedy for my liking.

Maybe I'm bad at proompting. Or fussy about code quality. Or surprised when next-gen tech struggles with something "find and replace" has been getting on with just fine for decades.

Either way, for others running into this issue, my experience has been that forcing the reasoning part of the agentic system to lay out an implementation plan and then stop what it's doing to await your input works fairly well and (somewhat) reduces the occurrences of such loops.

I am basically using this as a template:

Add the new feature <your_feature> with the following requirements:

* requirement 1
* requirement 2

Before generating any code, you must first review what actionable steps you will take and create an implementation plan, which you will store in the file `todo.md`

Prompt the user to decide which action you should undertake and only execute on that discrete unit of work until completion.

Once you have completed an action, use the markdown checkbox feature to indicate completion status, for example:

'''
- [ ] Incomplete Action
- [x] Completed Action
'''

**!IMPORTANT NOTICE!** Do not attempt to run any commandline operations at any point; only ever provide a code-block with the required command and prompt the user to run it manually.

As a positive aside, this also makes it easier to review the process as it goes along, and limits the code-butchering/colossal technical debt typically generated by copilot.
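To track progress against the `todo.md` file that the template asks for, the checkboxes can be read back with a few lines of Python. This is a minimal sketch: `parse_tasks` is a hypothetical helper, and the pattern is deliberately lenient so that it accepts both the standard `[x]` marker and the spaced `[ x ]` variant.

```python
import re
from typing import List, Tuple

# Matches "- [ ] text", "- [x] text" and the spaced "- [ x ] text" variant.
_TASK = re.compile(r"^\s*[-*]\s+\[\s*([xX]?)\s*\]\s+(.*\S)\s*$")


def parse_tasks(markdown: str) -> List[Tuple[bool, str]]:
    """Return (is_done, description) for every checkbox line in the text."""
    tasks: List[Tuple[bool, str]] = []
    for line in markdown.splitlines():
        match = _TASK.match(line)
        if match:
            # Group 1 is "x"/"X" for a ticked box and empty for an open one.
            tasks.append((bool(match.group(1)), match.group(2)))
    return tasks
```

Counting the entries where `is_done` is false gives the number of actions still waiting for a decision.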

I'm digging through all of this trying to figure out exactly what I paid for. I was using Agent mode and after an hour or two I hit the "Sorry, you have exhausted the agent mode usage limit...." message. Okay, I thought, I'll upgrade to Pro+ for this month and try it out to get more premium requests. So I shelled out for Pro+... but even after restarting and signing in again, it appears I am still hitting some kind of limit. Agent mode is really nice, but it's tempting to start looking at alternatives. It appears I paid for an upgrade that may not do anything for me if it's not actually adding any more agent mode usage. Even with all of this, from how things read it seemed like it should have been unlimited even on Pro.

Even on pro+ I'm hitting limits even for 4o using agent mode.

EDIT: It appears at least the billing was prorated against the remaining time on the current billing cycle. At least there is that.

joshsten Apr 16 '25 03:04

Thanks for your feedback. We have improved the rate limits with agent mode. Please let me know if you still hit limits starting from today (Thursday Apr 17th).

Thank you

isidorn Apr 17 '25 06:04

> Thanks for your feedback. We have improved the rate limits with agent mode. Please let me know if you still hit limits starting from today (Thursday Apr 17th).
>
> Thank you

Hi, unfortunately I am still experiencing the same issue: "Sorry, you have exhausted the agent mode usage limit. Please switch to ask mode and try again later." I used Agent mode for around 2 hours today (Thursday, April 17th).

EDIT: It's working again now! Maybe it just needed some rollout time. Thank you!

gomesc01d Apr 17 '25 08:04

Thanks - good news. Let me know if you still hit it - and if you hit it what was your scenario that triggered the limit.

isidorn Apr 17 '25 12:04

I have a pro subscription and have been using copilot for 5 hours today (Thursday, April 17th) and hit rate limiting. It is also telling me to switch to ask mode, which is also rate limited. I have been using Claude 3.5 Sonnet.

AntaeusNar Apr 17 '25 20:04

I don't have a pro subscription but I am more than happy to sign up for one if it's actually unlimited, but these threads are definitely making me hesitant to pay anything at all.

RobertCharron Apr 17 '25 21:04