Free Tier incorrectly reports "you've exhausted your daily usage on model" for all requests (possibly VPN-related)
What happened?
Gemini CLI refuses to run any prompts, even simple text-only ones, and always returns the error:
User: can you help me refactor
Responding with gemini-2.5-pro
✕ [API Error: You have exhausted your daily quota on this model.]
[INSERT] auto (100% context left) | 141.2 MB | ✖ 9 errors (F12 for details)
This happens even when the Free Tier usage should be at 0% (new day, no requests made). The issue occurs whether or not I use file-based prompts (@filename). For example, both of the following fail:
"hello"
and
@input.txt "analyze this"
However, the service used to work normally under the same environment and VPN configuration. Now it consistently rejects all requests with the same error.
What did you expect to happen?
Expected Behavior: Requests should run normally when daily usage has not been exceeded. If the issue is related to network/VPN restrictions, the error message should indicate that, instead of incorrectly reporting quota exhaustion.
Actual Behavior: All requests return "you've exhausted your daily usage on model" regardless of actual usage or prompt type.
Client information
- CLI Version: 0.20.0
- Git Commit: d0ce3c4c5
- Session ID: 0ac88306-2c76-4b19-b609-2b488ef3a551
- Operating System: darwin v22.17.0
- Sandbox Environment: no sandbox
- Model Version: auto
- Memory Usage: 97.6 MB
Login information
API key. The VPN is in the same country where the key was generated.
Steps to Reproduce:
Have an active Google account with Free Tier access (usage 0 for the day).
Install and authenticate the Gemini CLI normally.
Run any request, for example:
gemini "test"
The CLI responds with: you've exhausted your daily usage on model
Anything else we need to know?
re-entering the key doesn't work
storing the API in shell env (GEMINI_API_KEY=
https://github.com/google-gemini/gemini-cli/issues/14966
Similar issue.
hi @sashakosti I would love to work on this issue, please assign
I can confirm the problem: even going a full day without making any requests doesn't reset the quota. The very first request of a new day is rejected as exceeding the daily quota.
Same issue. Even a new API key returns this error.
same issue here
The full text of the system instruction (4,375 tokens) is re-uploaded with every single API request. This causes the application to hit the Tokens Per Minute quota almost instantly, even if the Requests Per Minute count is low.
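To make the arithmetic behind that claim concrete, here is a minimal sketch. Only the 4,375-token figure comes from the comment above. The per-minute limit and the prompt size are hypothetical placeholders, not published quota values.

```python
def requests_until_tpm_exhausted(tpm_limit: int, system_tokens: int, prompt_tokens: int) -> int:
    """Return how many requests fit into one minute of input-token budget."""
    per_request = system_tokens + prompt_tokens
    if per_request <= 0:
        raise ValueError("each request must consume at least one token")
    # Every request pays for the system instruction again, so the budget
    # is divided by the full per-request cost, not just the prompt size.
    return tpm_limit // per_request


SYSTEM_INSTRUCTION_TOKENS = 4375   # figure quoted in the comment above
HYPOTHETICAL_TPM_LIMIT = 125_000   # assumption for illustration only
HYPOTHETICAL_PROMPT_TOKENS = 25    # assumption for illustration only

if __name__ == "__main__":
    print(requests_until_tpm_exhausted(
        HYPOTHETICAL_TPM_LIMIT, SYSTEM_INSTRUCTION_TOKENS, HYPOTHETICAL_PROMPT_TOKENS
    ))
```

Note that with a token limit of 0 the function returns 0, meaning no request can succeed regardless of how small the prompt is.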
Where can I see it and how can I reset it?
Is everyone on this thread running with GEMINI_API_KEY authentication? Or is OAuth authentication also producing similar issues?
I also use the "Use Gemini API Key" option.
When will this bug be fixed? It is really inconvenient.
The quota errors from GEMINI_API_KEY users originate from the underlying generativelanguage API. It doesn't seem like a CLI issue.
However, can you try running this sample code directly to verify that the key works? After how many runs does it start failing? https://ai.google.dev/gemini-api/docs/api-key#provide-api-key-explicitly
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent" \
-H 'Content-Type: application/json' \
-H "x-goog-api-key: YOUR_API_KEY" \
-X POST \
-d '{
"contents": [
{
"parts": [
{
"text": "Explain how AI works in a few words"
}
]
}
]
}'
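For anyone who wants to answer the "after how many runs" question without pasting the curl command repeatedly, here is a rough Python sketch using only the standard library. It sends the same URL, headers, and body as the curl command above. The helper names (build_request, send_once, count_runs_until_failure) and the 20-run cap are my own assumptions, not part of any Google tooling.

```python
import json
import urllib.error
import urllib.request
from typing import Callable

URL = "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent"


def build_request(api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    body = {"contents": [{"parts": [{"text": "Explain how AI works in a few words"}]}]}
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def send_once(api_key: str) -> int:
    """Send one request and return the HTTP status code (e.g. 200 or 429)."""
    try:
        with urllib.request.urlopen(build_request(api_key), timeout=30) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def count_runs_until_failure(send: Callable[[], int], max_runs: int = 20) -> int:
    """Return how many consecutive calls returned HTTP 200 before the first failure."""
    for ok_runs in range(max_runs):
        if send() != 200:
            return ok_runs
    return max_runs
```

A possible invocation, assuming the key is exported as GEMINI_API_KEY: `count_runs_until_failure(lambda: send_once(os.environ["GEMINI_API_KEY"]))` after `import os`. A result of 0 means the very first request was already rejected.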
{
"error": {
"code": 429,
"message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\nPlease retry in 44.665768494s.",
"status": "RESOURCE_EXHAUSTED",
"details": [
{
"@type": "type.googleapis.com/google.rpc.Help",
"links": [
{
"description": "Learn more about Gemini API quotas",
"url": "https://ai.google.dev/gemini-api/docs/rate-limits"
}
]
},
{
"@type": "type.googleapis.com/google.rpc.QuotaFailure",
"violations": [
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
"quotaId": "GenerateContentInputTokensPerModelPerDay-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
},
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
"quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
},
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
"quotaId": "GenerateRequestsPerMinutePerProjectPerModel-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
},
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
"quotaId": "GenerateContentInputTokensPerModelPerMinute-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
}
]
},
{
"@type": "type.googleapis.com/google.rpc.RetryInfo",
"retryDelay": "44s"
}
]
}
}
Since you're exhausting the GenerateContentInputTokensPerModelPerDay-FreeTier quota of the generativelanguage API itself, I'm afraid there isn't anything the Gemini CLI can do. You will have to try a different key, or a different auth method.
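To see which quotas a 429 response names without reading the whole JSON by eye, something like the following sketch may help. It assumes the response has the same shape as the one pasted above (an "error" object whose "details" list contains google.rpc.QuotaFailure and google.rpc.RetryInfo entries). The function name summarize_quota_error is hypothetical.

```python
from typing import Any


def summarize_quota_error(payload: dict[str, Any]) -> dict[str, Any]:
    """Collect the status, violated quota IDs, and retry delay from an error body."""
    error = payload.get("error", {})
    quota_ids: list[str] = []
    retry_delay = None
    for detail in error.get("details", []):
        kind = detail.get("@type", "")
        if kind.endswith("google.rpc.QuotaFailure"):
            # Each violation names one quota, e.g. a per-day or per-minute limit.
            quota_ids += [v.get("quotaId", "") for v in detail.get("violations", [])]
        elif kind.endswith("google.rpc.RetryInfo"):
            retry_delay = detail.get("retryDelay")
    return {"status": error.get("status"), "quota_ids": quota_ids, "retry_delay": retry_delay}
```

Feeding it the response above should list both the per-day and per-minute FreeTier quota IDs, which shows that the daily quota is not the only one being reported.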
So what should we do with the basic generativelanguage API? And who exhausted it if we didn't make any requests? Did they reset the free quota? Just say so, or will you fix it, but as part of a different task? Or is there some general key quota beyond the daily one?
I got the same result with a free tier API key. Look at the response message:
Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro
I guess the "limit: 0" makes it impossible to do anything with the API key.
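One way to tell a used-up quota from a quota that was never granted is to pull the "limit:" value out of the error message. This is a minimal sketch that assumes the message keeps the "Quota exceeded for metric: ..., limit: N, model: ..." wording shown above. The regular expression and the function name zero_limit_metrics are my own.

```python
import re

# Matches lines like:
# "Quota exceeded for metric: <metric>, limit: 0, model: gemini-2.5-pro"
QUOTA_LINE = re.compile(
    r"Quota exceeded for metric: (?P<metric>[^,\s]+), limit: (?P<limit>\d+), model: (?P<model>\S+)"
)


def zero_limit_metrics(message: str) -> list[str]:
    """Return the distinct metrics whose reported limit is 0, in order of appearance."""
    seen: list[str] = []
    for match in QUOTA_LINE.finditer(message):
        if int(match.group("limit")) == 0 and match.group("metric") not in seen:
            seen.append(match.group("metric"))
    return seen
```

If every metric in the message shows up in the result, waiting for the daily reset is unlikely to help, because a limit of 0 leaves no allowance to reset to.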
It seems I have found the solution: enable the settings for Gemini-3.
A few hypotheses: Google wants free-tier users to help test Gemini-3, rather than using Gemini-2.5 for continuous work. (Freeloading is unacceptable; testing and trials are welcome.)
A few clues about the free-tier API:
- Google has removed Gemini 2.5-Pro and restricted Gemini 2.5-Flash.
- We cannot find 2.5-Pro on the usage data page, and the RPM for 2.5-Flash is only 5, while that of Gemini-3 is 30.
- We won't be able to use CLI with Gemini-3 disabled. (Just like this issue)
I tried using "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-preview:generateContent" to access gemini-3-pro, but the result is the same: a 429 error with limit: 0.
(Gemini 3 Pro was enabled when I first got the usage exhausted error. I then switched to Gemini 2.5 Pro, but that didn't help either.)
{
"error": {
"code": 429,
"message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\nPlease retry in 6.350918213s.",
"status": "RESOURCE_EXHAUSTED",
"details": [
{
"@type": "type.googleapis.com/google.rpc.Help",
"links": [
{
"description": "Learn more about Gemini API quotas",
"url": "https://ai.google.dev/gemini-api/docs/rate-limits"
}
]
},
{
"@type": "type.googleapis.com/google.rpc.QuotaFailure",
"violations": [
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
"quotaId": "GenerateContentInputTokensPerModelPerDay-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
},
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
"quotaId": "GenerateContentInputTokensPerModelPerMinute-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
},
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
"quotaId": "GenerateRequestsPerMinutePerProjectPerModel-FreeTier",
"quotaDimensions": {
"model": "gemini-2.5-pro",
"location": "global"
}
},
{
"quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
"quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
"quotaDimensions": {
"location": "global",
"model": "gemini-2.5-pro"
}
}
]
},
{
"@type": "type.googleapis.com/google.rpc.RetryInfo",
"retryDelay": "6s"
}
]
}
}
same response, freshly generated API key, old VPN connection