cline Error while using Gemini 2.5 pro

What happened?

When using the Cline VSCode extension configured with the Google Gemini 2.5 Pro model, the error message "Provider returned error" frequently occurs without providing additional details or context. The issue temporarily resolves after clicking "Retry" several times, enabling successful processing of one or two requests, after which the error reappears.

Steps to reproduce

Configure the Cline VSCode extension with the Google Gemini 2.5 Pro model.
Initiate a chat or request via the Cline extension interface.
Observe the frequent occurrence of the "Provider returned error" message.
Click the "Retry" button several times until the request eventually succeeds.
Attempt additional requests; the error reoccurs regularly after one or two successful interactions.

Relevant API REQUEST output

Operating System

Windows 10

Cline Version

v3.8.0

Additional context

No response

Mar 29 '25 21:03 tschreiner

Thanks for reporting this issue and providing the detailed steps to reproduce.

Based on the behavior you described (frequent errors requiring retries, working for 1-2 requests then failing again) and the model you're using (Gemini 2.5 Pro Experimental), this strongly suggests you're encountering the rate limits imposed by Google on this specific experimental model tier via your API key.

As shown in the model details from Google AI Studio (like the image provided):

The model gemini-2.5-pro-exp-03-25 is clearly marked as Experimental. Experimental models often have stricter limits and less stability than generally available ones. It also has specific rate limits: the image shows a general limit of 5 RPM (requests per minute), but more importantly a free-tier limit of 2 RPM and 50 requests per day.

It's highly likely that your usage pattern, even with just a few rapid requests or retries, is exceeding the 2 RPM limit associated with your Google API key for this free, experimental model. When you exceed this limit, Google's API returns an error. Cline receives this generic "Provider returned error" because the underlying API call failed due to the rate limit. Clicking "Retry" might eventually work once enough time has passed (e.g., > 30 seconds) for the rate limit window to allow another request.
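As a rough check of the "> 30 seconds" figure, here is a minimal sketch of the arithmetic. It assumes the limit behaves like evenly spaced requests; how Google actually windows the quota is not stated in this thread, and the helper name is made up for illustration.

```python
def min_seconds_between_requests(requests_per_minute: float) -> float:
    """Smallest average gap between requests that stays within an RPM limit."""
    if requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be positive")
    return 60.0 / requests_per_minute


# Free-tier limit quoted above: 2 RPM. General limit quoted above: 5 RPM.
free_tier_gap = min_seconds_between_requests(2)
general_gap = min_seconds_between_requests(5)
print(free_tier_gap, general_gap)
```

At 2 RPM, a request plus one quick retry already uses up the whole minute's allowance, which would match the "works once or twice, then fails" pattern.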

This isn't an issue with Cline's code itself or a shared API key, but rather a limitation imposed by the provider (Google) on the specific model tier you've chosen to use with your personal API key.

Recommendations:

Switch to a More Stable Model: Try configuring Cline to use a more stable, generally available Gemini model (like Gemini 1.0 Pro or Gemini 1.5 Pro, depending on availability and your needs). These often have higher rate limits.
Check Limits in Google AI Studio: Explore the different models available in Google AI Studio and check their specific rate limits to find one that better aligns with your expected usage.
Consider Other Providers: If Google's rate limits are too restrictive for your workflow, you could configure Cline to use a different LLM provider.

Let me know if switching to a different, non-experimental Gemini model resolves the persistent errors for you.

Mar 30 '25 01:03 arafatkatze

All that being said, the error display could be done in a better way, so that's something that needs to be looked at.

I was able to replicate this by hitting the API rate limits myself, and it looked like this:

image.png: https://github.com/user-attachments/assets/d241622f-61aa-422a-a24e-3e95421d6bd2

I hit my API rate limits deliberately with a Python script, and the error from Gemini looked like this:

*** Rate limit error encountered on request 1292! Stopping. ***
Error details: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. [violations {
}
, links {
  description: "Learn more about Gemini API quotas"
  url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
, retry_delay {
  seconds: 19
}
]

*** Rate limit error encountered on request 1282! Stopping. ***
Error details: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. [violations {
}
, links {
  description: "Learn more about Gemini API quotas"
  url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
, retry_delay {
  seconds: 19
}
]

Mar 30 '25 01:03 arafatkatze

Hi,

thanks for your reply.

I suspected rate limits, but I was missing a button like "Show details" or similar. It wasn't apparent what had caused the failed API request.

I also should have said that I used the model via OpenRouter and not directly.

Thanks

Mar 30 '25 01:03 tschreiner

It would be neat to implement some sort of automatic rate limiting feature, pulling the time to wait from the API response.

This is what my typical rate limit response from the Google API looks like: (Line breaks added for readability)

[GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:streamGenerateContent?alt=sse: [429 Too Many Requests] 
You exceeded your current quota, please check your plan and billing details. 
For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. 
[{"@type":"type.googleapis.com/google.rpc.QuotaFailure",
"violations":[{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_requests",
"quotaId":"GenerateRequestsPerDayPerProjectPerModel-FreeTier","quotaDimensions":
{"location":"global","model":"gemini-2.0-pro-exp"},"quotaValue":"50"}]},
{"@type":"type.googleapis.com/google.rpc.Help","links":
[{"description":"Learn more about Gemini API quotas",
"url":"https://ai.google.dev/gemini-api/docs/rate-limits"}]},
{"@type":"type.googleapis.com/google.rpc.RetryInfo","retryDelay":"4s"}]

The retryDelay field at the end gives the time left to wait. We could grab that, set a timer, and automatically retry the request after the time has elapsed.

Possibly with an extra second added on just to be sure.

It could probably be adapted to other models/APIs as well. I'm sure most APIs have a similar response to rate limiting. A new input box could be added to the model settings, allowing the user to specify the name of the field that holds the time to wait.

If I have enough time this week, I might give it a whirl (unless someone beats me to it). I'm not that great with TypeScript though, so no promises! haha.
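The idea above can be sketched roughly as follows. This is an illustration in Python, not Cline's code: `parse_retry_delay` and `call_with_retry` are hypothetical names, and the sketch assumes the delay always appears in the error text in the `"retryDelay":"4s"` form shown in the response above.

```python
import re
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Matches the RetryInfo field as it appears in the error text, e.g. "retryDelay":"4s".
_RETRY_DELAY = re.compile(r'"retryDelay"\s*:\s*"(\d+(?:\.\d+)?)s"')


def parse_retry_delay(error_message: str) -> Optional[float]:
    """Return the suggested wait in seconds, or None if the message has no retryDelay."""
    match = _RETRY_DELAY.search(error_message)
    return float(match.group(1)) if match else None


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 3,
    extra_seconds: float = 1.0,  # the "extra second just to be sure"
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); on an error that carries a retryDelay, wait and try again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Exception as exc:
            delay = parse_retry_delay(str(exc))
            # Re-raise if the error has no delay hint or we are out of attempts.
            if delay is None or attempt == max_attempts:
                raise
            sleep(delay + extra_seconds)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Note that a delay hint may not help when a per-day quota is exhausted, so a cap on attempts still matters.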

Mar 30 '25 23:03 remghoost

An RPM config option would be more scalable, so that Cline does the waiting itself. For example, I might want to run at most X requests per minute without caring about the model.
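A minimal sketch of what such an RPM cap could look like, written in Python for illustration. The class name and its parameters are hypothetical and this is not an existing Cline setting; it uses a sliding 60-second window, which is an assumption about how the cap should behave.

```python
import time
from collections import deque
from typing import Callable, Deque


class RequestsPerMinuteLimiter:
    """Blocks callers so that at most max_per_minute requests start in any 60 s window."""

    def __init__(
        self,
        max_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_minute < 1:
            raise ValueError("max_per_minute must be at least 1")
        self.max_per_minute = max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # start times of recent requests

    def wait_time(self) -> float:
        """Seconds until another request may start (0.0 if one may start now)."""
        now = self._clock()
        # Forget requests that have left the 60-second window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.max_per_minute:
            return 0.0
        return 60.0 - (now - self._sent[0])

    def acquire(self) -> None:
        """Wait until a request slot is free, then record the request."""
        while (delay := self.wait_time()) > 0:
            self._sleep(delay)
        self._sent.append(self._clock())
```

A caller would invoke `limiter.acquire()` immediately before each API request. Because the cap is independent of the provider's response format, it works the same for any model.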

Apr 15 '25 05:04 axellpadilla

Same error. I tried a different API key and still get the same error.

Apr 20 '25 11:04 myudak

Rate limits are based on the project, not on the API key.

Apr 21 '25 14:04 arin2115

I ran into the same 429 error when using Gemini 2.5 Pro, but not while using Gemini 2.5 Flash.

The issue was that the API key I created at https://aistudio.google.com/apikey was linked to a Google Cloud Console project which was not linked to a billing account.

To link it, open Cloud Console, select the project, and click the "Billing" section. If it's not yet linked, it will ask you to link it to an existing billing account.

Jun 02 '25 14:06 slhck

I get this error (line breaks added for readability):

GoogleGenerativeAIFetchError: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:generateContent: [429 ]
You exceeded your current quota, please check your plan and billing details.
For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.
[{"@type":"type.googleapis.com/google.rpc.QuotaFailure","violations":[
{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count","quotaId":"GenerateContentInputTokensPerModelPerMinute-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.0-pro-exp"}},
{"quotaMetric":"generativelanguage.googleapis.com/generate_requests_per_model_per_day","quotaId":"GenerateRequestsPerDayPerProjectPerModel"},
{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_requests","quotaId":"GenerateRequestsPerMinutePerProjectPerModel-FreeTier","quotaDimensions":{"model":"gemini-2.0-pro-exp","location":"global"}},
{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_requests","quotaId":"GenerateRequestsPerDayPerProjectPerModel-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.0-pro-exp"}},
{"quotaMetric":"generativelanguage.googleapis.com/generate_content_free_tier_input_token_count","quotaId":"GenerateContentInputTokensPerModelPerDay-FreeTier","quotaDimensions":{"location":"global","model":"gemini-2.0-pro-exp"}}]},
{"@type":"type.googleapis.com/google.rpc.Help","links":[{"description":"Learn more about Gemini API quotas","url":"https://ai.google.dev/gemini-api/docs/rate-limits"}]},
{"@type":"type.googleapis.com/google.rpc.RetryInfo","retryDelay":"9s"}]
at handleResponseNotOk (@google_generative-ai.js?v=e8b28834:226:9)
at async makeRequest (@google_generative-ai.js?v=e8b28834:199:5)
at async generateContent (@google_generative-ai.js?v=e8b28834:544:20)
at async ChatSession.sendMessage (@google_generative-ai.js?v=e8b28834:802:5)
at async OnGenerateTrip (index.jsx:102:22)
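An error like this lists several quotas at once, so it can help to pull out the quotaId values and see whether a per-day quota is involved (in which case waiting a few seconds will not help). The following is a hedged sketch: both function names are hypothetical, and it assumes the JSON details start at the first `[{"@type"` in the message, as they do in the errors quoted in this thread.

```python
import json
from typing import List


def violated_quota_ids(error_message: str) -> List[str]:
    """Extract quotaId values from the QuotaFailure details embedded in an error message."""
    start = error_message.find('[{"@type"')
    if start == -1:
        return []
    try:
        # raw_decode parses one JSON value and ignores trailing text (e.g. a stack trace).
        details, _end = json.JSONDecoder().raw_decode(error_message, start)
    except json.JSONDecodeError:
        return []
    ids: List[str] = []
    for detail in details:
        if detail.get("@type", "").endswith("google.rpc.QuotaFailure"):
            for violation in detail.get("violations", []):
                ids.append(violation.get("quotaId", "<unknown>"))
    return ids


def is_daily_quota(quota_id: str) -> bool:
    """Heuristic based on the quotaId naming seen in this thread, not a documented rule."""
    return "PerDay" in quota_id
```

A client could surface these IDs in a "Show details" view, for example by printing `violated_quota_ids(str(error))` alongside the generic message.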

Jun 06 '25 09:06 sonukumarsaw12

I have the same error when I change from Gemini to Qwen and try to use Qwen:

404 The model `gemini-2.5-pro` does not exist or you do not have access to it.
Request ID: 0261327e-47b9-96a3-920f-4895030a7ce6

Jul 27 '25 00:07 NightZpy

I'm trying to use Gemini 2.5 Pro, and at random times it fails with a 500 error:

{"@timestamp":"2025-08-15T07:03:45Z","level":"error","message":"Gemini API error response: {\"error\":{\"code\":500,\"message\":\"An internal error has occurred. Please retry or report in https:\\/\\/developers.generativeai.google\\/guide\\/troubleshooting\",\"status\":\"INTERNAL\"}}","pid":-31656,"service":"aichat"}

It works fine every time I use Gemini 2.5 Flash. Is there any solution for this? It's frustrating that Gemini 2.5 Pro fails with an error at random times.

Aug 15 '25 09:08 Dimasilham7

@Dimasilham7 It's just a rate limit issue on Gemini's end, and there's not much we can do to help there.

We do have automated retries for Gemini, but that's about it.
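For readers who call the API from their own scripts, automated retries are commonly done with exponential backoff. The sketch below is a generic illustration and is not Cline's actual retry logic; `ApiError`, the set of retryable status codes, and the default delays are all assumptions.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Assumption: treat rate limits (429) and transient server errors (500, 503) as retryable.
RETRYABLE_STATUS = {429, 500, 503}


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed request."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status} {message}".strip())
        self.status = status


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Delays of base, 2*base, 4*base, ... seconds, each limited to cap."""
    return [min(cap, base * 2 ** i) for i in range(max_retries)]


def retry_with_backoff(
    request: Callable[[], T],
    max_retries: int = 4,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,  # adds 0 to 1 s to spread out retries
) -> T:
    """Run request(), retrying retryable ApiErrors with growing delays."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return request()
        except ApiError as exc:
            if exc.status not in RETRYABLE_STATUS:
                raise
            sleep(delay + jitter())
    # Final attempt: any error now propagates to the caller.
    return request()
```

Backoff only spaces out attempts; it cannot recover from an exhausted daily quota or a persistent provider outage.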

Aug 15 '25 17:08 arafatkatze

Hi, thanks for answering. So the issue is on Gemini's side, correct?

Aug 15 '25 17:08 Dimasilham7

Yes, Gemini has strange rate limits which vary widely. There's really not much we can do besides retry requests.

Aug 15 '25 18:08 arafatkatze