gemini-cli Free Tier incorrectly reports "you've exhausted your daily usage on model" for all requests (possibly VPN-related)

What happened?

Gemini CLI refuses to run any prompts, even simple text-only ones, and always returns the error:

User:  can you help me refactor
Responding with gemini-2.5-pro
✕  [API Error: You have exhausted your daily quota on this model.]
[INSERT]  auto (100% context left)  |  141.2 MB |  ✖ 9 errors (F12 for details)

This happens even when the Free Tier usage should be at 0% (new day, no requests made). The issue occurs whether or not I use file-based prompts (@filename). For example, both of the following fail:

"hello" and @input.txt "analyze this"

The service used to work normally under the same environment and VPN configuration. Now it consistently rejects all requests with the same error.

What did you expect to happen?

Expected Behavior: Requests should run normally when daily usage has not been exceeded. If the issue is related to network/VPN restrictions, the error message should indicate that, instead of incorrectly reporting quota exhaustion.

Actual Behavior: All requests return "you've exhausted your daily usage on model" regardless of actual usage or prompt type.

Client information

CLI Version: 0.20.0
Git Commit: d0ce3c4c5
Session ID: 0ac88306-2c76-4b19-b609-2b488ef3a551
Operating System: darwin v22.17.0
Sandbox Environment: no sandbox
Model Version: auto
Memory Usage: 97.6 MB

Login information

API key. The VPN exit is in the same country where the key was generated.

Steps to Reproduce:

Have an active Google account with Free Tier access (usage 0 for the day).

Install and authenticate the Gemini CLI normally.

Run any request, for example:

gemini "test"

The CLI responds with: you've exhausted your daily usage on model

Anything else we need to know?

Re-entering the key doesn't help, and neither does storing the API key in the shell environment (GEMINI_API_KEY=). The key is correct in the application. It just hasn't worked since version 18.0 or somewhere around it.

Dec 12 '25 10:12 sashakosti

https://github.com/google-gemini/gemini-cli/issues/14966

Similar issue

Dec 12 '25 10:12 sashakosti

Hi @sashakosti, I would love to work on this issue. Please assign it to me.

Dec 12 '25 11:12 ishaanxgupta

I can also confirm the problem: even going a full day without making any requests doesn't clear the quota overflow. The very first request of a new day already exceeds the daily quota.

Dec 12 '25 19:12 SpectatorLife

Same issue. Even a new API key returns this error.

Dec 15 '25 02:12 c0o1sy1z3

same issue here

Dec 15 '25 09:12 robert4948

The full text of the system instruction (4,375 tokens) is re-uploading with every single API request. This causes the application to hit the Tokens Per Minute quota almost instantly, even if Requests Per Minute count is low.
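To make the arithmetic behind this concrete, here is a minimal sketch. The 4,375-token figure is taken from the observation above; the per-minute token limit and the prompt size are hypothetical placeholders chosen for illustration, not published quota values.

```python
def max_requests_per_minute(tpm_limit: int, system_tokens: int, prompt_tokens: int) -> int:
    """Requests that fit in one minute when every request resends the system instruction."""
    per_request = system_tokens + prompt_tokens
    if per_request <= 0:
        raise ValueError("a request must consume at least one token")
    return tpm_limit // per_request


SYSTEM_TOKENS = 4375        # system instruction size reported in this thread
HYPOTHETICAL_TPM = 20_000   # assumption: illustrative tokens-per-minute limit
HYPOTHETICAL_PROMPT = 125   # assumption: illustrative user prompt size

print(max_requests_per_minute(HYPOTHETICAL_TPM, SYSTEM_TOKENS, HYPOTHETICAL_PROMPT))  # 4
print(max_requests_per_minute(0, SYSTEM_TOKENS, HYPOTHETICAL_PROMPT))                 # 0
```

Under these assumed numbers only a handful of requests fit in a minute, and a limit of 0 admits none regardless of prompt size.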

Dec 15 '25 22:12 ya-john

Where can I see it and how can I reset it?

Dec 16 '25 07:12 SpectatorLife

Is everyone on this thread running with GEMINI_API_KEY authentication? Or is OAuth authentication also producing similar issues?

Dec 17 '25 01:12 gsquared94

I also use the "Use Gemini API Key" option.

Dec 17 '25 20:12 SpectatorLife

When will this bug be fixed? It is really inconvenient.

Dec 18 '25 06:12 c0o1sy1z3

The quota errors from GEMINI_API_KEY users originate from the underlying generativelanguage API. It doesn't seem like a CLI issue.

However, can you try running this sample code directly to verify that the key works? After how many runs does it start failing? https://ai.google.dev/gemini-api/docs/api-key#provide-api-key-explicitly

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent" \
  -H 'Content-Type: application/json' \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -X POST \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "Explain how AI works in a few words"
          }
        ]
      }
    ]
  }'
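
For anyone who prefers a script to repeated curl calls, the following is a rough Python sketch of the same request plus a counter for "how many runs until it fails". The helper names (`build_request`, `send_once`, `runs_before_failure`) are made up for this illustration; only the URL, header, and body shape come from the curl command above.

```python
import json
import urllib.error
import urllib.request
from typing import Callable

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def send_once(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status, including error statuses such as 429."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def runs_before_failure(send: Callable[[], int], max_runs: int = 20) -> int:
    """Call send() up to max_runs times; return how many calls succeeded before the first non-200."""
    for completed in range(max_runs):
        if send() != 200:
            return completed
    return max_runs
```

A real run would look like `runs_before_failure(lambda: send_once(build_request("YOUR_API_KEY", "gemini-2.5-pro", "Explain how AI works in a few words")))`. A result of 0 means the very first call was rejected.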

Dec 18 '25 06:12 gsquared94

{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\nPlease retry in 44.665768494s.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
            "quotaId": "GenerateContentInputTokensPerModelPerDay-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          },
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
            "quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          },
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
            "quotaId": "GenerateRequestsPerMinutePerProjectPerModel-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          },
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
            "quotaId": "GenerateContentInputTokensPerModelPerMinute-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "44s"
      }
    ]
  }
}
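
A response like the one above can also be inspected programmatically. The sketch below assumes only the field names visible in this JSON (`details`, `@type`, `violations`, `quotaId`, `retryDelay`); the function name and the returned dictionary layout are illustrative choices.

```python
from typing import Any


def summarize_quota_error(payload: dict[str, Any]) -> dict[str, Any]:
    """Collect the violated quota IDs and the suggested retry delay from a 429 body."""
    error = payload.get("error", {})
    quota_ids: list[str] = []
    retry_seconds = None
    for detail in error.get("details", []):
        kind = detail.get("@type", "")
        if kind.endswith("google.rpc.QuotaFailure"):
            quota_ids += [v.get("quotaId", "") for v in detail.get("violations", [])]
        elif kind.endswith("google.rpc.RetryInfo"):
            # retryDelay is a duration string such as "44s"
            retry_seconds = float(detail.get("retryDelay", "0s").rstrip("s"))
    return {
        "status": error.get("status"),
        "quota_ids": quota_ids,
        # True when at least one violated quota is a per-day quota
        "per_day_violated": any("PerDay" in quota_id for quota_id in quota_ids),
        "retry_seconds": retry_seconds,
    }
```

When `per_day_violated` is true, waiting for the short `retry_seconds` delay alone seems unlikely to help, since per-day quotas are listed next to the per-minute ones.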

Dec 18 '25 08:12 c0o1sy1z3

Since you're exhausting the GenerateContentInputTokensPerModelPerDay-FreeTier quota of the generativelanguage API itself, I'm afraid there isn't anything the Gemini CLI can do. You will have to try a different key or a different auth method.

Dec 18 '25 21:12 gsquared94

So what should we do about the underlying generativelanguage API? And who exhausted the quota if we didn't make any requests? Did Google reset the free quota? If so, just say so. Or will you fix it, but as part of a different task? Or is there some overall per-key quota beyond the daily one?

Dec 18 '25 22:12 SpectatorLife

I got the same result with a free-tier API key. Look at the response message:

Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro

I guess the "limit: 0" makes it impossible to do anything with the API key.
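
The "limit: 0" observation can be checked mechanically. Here is a small sketch that scans the error message for violation lines and returns the ones whose reported limit is zero. The regular expression is written against the message format quoted in this thread and is an assumption about that format, not a documented contract.

```python
import re

# Matches lines such as:
# "Quota exceeded for metric: <metric>, limit: 0, model: gemini-2.5-pro"
VIOLATION = re.compile(
    r"Quota exceeded for metric: (?P<metric>[^,\s]+), "
    r"limit: (?P<limit>\d+), model: (?P<model>[^,\s]+)"
)


def zero_limit_metrics(message: str) -> set[tuple[str, str]]:
    """Return the (metric, model) pairs whose reported limit is 0."""
    return {
        (match["metric"], match["model"])
        for match in VIOLATION.finditer(message)
        if int(match["limit"]) == 0
    }
```

A non-empty result suggests the key has no allocation at all for that metric and model, which is different from a quota that was used up during the day.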

Dec 19 '25 01:12 h880015

It seems I have found the solution: enable the Gemini-3 setting.

One hypothesis: Google wants free-tier users to help test Gemini-3 rather than use Gemini-2.5 for continuous work. (Freeloading is unacceptable; testing and trials are welcome.)

A few clues about the free-tier API:

Google has removed Gemini 2.5-Pro and restricted Gemini 2.5-Flash.
Gemini 2.5-Pro no longer appears on the usage data page, and the RPM for 2.5-Flash is only 5, while that of Gemini-3 is 30.
The CLI cannot be used with Gemini-3 disabled (which is what this issue shows).

Dec 19 '25 03:12 c0o1sy1z3

I tried using "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-preview:generateContent" to access gemini-3-pro, but the result is the same: a 429 error with limit: 0.

(Gemini 3 Pro was enabled when I first got the usage-exhausted error. I then switched to Gemini 2.5 Pro, but in vain.)

Dec 19 '25 06:12 h880015

{
  "error": {
    "code": 429,
    "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\n* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro\nPlease retry in 6.350918213s.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Learn more about Gemini API quotas",
            "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
            "quotaId": "GenerateContentInputTokensPerModelPerDay-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          },
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_input_token_count",
            "quotaId": "GenerateContentInputTokensPerModelPerMinute-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          },
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
            "quotaId": "GenerateRequestsPerMinutePerProjectPerModel-FreeTier",
            "quotaDimensions": {
              "model": "gemini-2.5-pro",
              "location": "global"
            }
          },
          {
            "quotaMetric": "generativelanguage.googleapis.com/generate_content_free_tier_requests",
            "quotaId": "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
            "quotaDimensions": {
              "location": "global",
              "model": "gemini-2.5-pro"
            }
          }
        ]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "6s"
      }
    ]
  }
}

Same response with a freshly generated API key and the old VPN connection.

Dec 19 '25 18:12 sashakosti