When the quota for gemini-3-pro-preview was reached, the CLI did not switch to another model and the request simply ended
What happened?
Here I made the request, and then made it again after the error.
It still redirects to http://localhost:4200/404 Does it have to be also in blogConfig.json?
Responding with gemini-3-pro-preview ✕ [API Error: [{ "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "errors": [ { "message": "Resource has been exhausted (e.g. check quota).", "domain": "global", "reason": "rateLimitExceeded" } ], "status": "RESOURCE_EXHAUSTED" } } ]] Possible quota limitations in place or slow response times detected. Switching to the gemini-2.5-flash model for the rest of this session.
It still redirects to http://localhost:4200/404 Does it have to be also in blogConfig.json?
Responding with gemini-3-pro-preview ✕ [API Error: [{ "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "errors": [ { "message": "Resource has been exhausted (e.g. check quota).", "domain": "global", "reason": "rateLimitExceeded" } ], "status": "RESOURCE_EXHAUSTED" } } ]] Possible quota limitations in place or slow response times detected. Switching to the gemini-2.5-flash model for the rest of this session.
What did you expect to happen?
I expected the first request to continue with the lower-tier model, but instead the request ended. I then expected the lower-tier model to be used on retry, but when I made the same request, the same error happened.
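The expected behavior can be sketched as client-side fallback logic: on a 429, retry the same request once with the downgraded model instead of aborting the turn. This is a hedged sketch, not the CLI's actual implementation; `send_request` and `QuotaExceeded` are hypothetical stand-ins for the CLI's API call and its 429 error, and the model names are taken from the session above.

```python
# Sketch of expected fallback behavior on quota exhaustion.
# `send_request` and `QuotaExceeded` are hypothetical stand-ins,
# NOT real Gemini CLI internals.

FALLBACK_CHAIN = ["gemini-3-pro-preview", "gemini-2.5-flash"]

class QuotaExceeded(Exception):
    """Stand-in for the 429 RESOURCE_EXHAUSTED error."""

def send_with_fallback(prompt, send_request):
    """Try each model in the chain; retry the same prompt on quota errors."""
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return send_request(model, prompt)
        except QuotaExceeded as exc:
            last_error = exc  # remember the error, try the next model
    raise last_error          # every model in the chain was exhausted

# Example: the first model always 429s, the second succeeds.
def fake_send(model, prompt):
    if model == "gemini-3-pro-preview":
        raise QuotaExceeded("429 RESOURCE_EXHAUSTED")
    return f"{model}: ok"
```

Under this sketch, `send_with_fallback("hi", fake_send)` completes with the flash model rather than ending the request, which is what the report says should have happened.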
Client information
- CLI Version: 0.20.0
- Git Commit: d0ce3c4c5
- Session ID: dacfee0a-f1cb-499b-bcbc-a858194fbb4f
- Operating System: win32 v22.9.0
- Sandbox Environment: no sandbox
- Model Version: auto
- Memory Usage: 169.7 MB
Login information
No response
Anything else we need to know?
In /model I am using Auto
I'm paying for Gemini 3, yet I can't get a single task out of it via gemini-cli before it runs out of quota. I'm not even convinced I'm getting the Gemini 3 model; it's masked under the opaque name 'pro', though stats show I had 13 requests to gemini-3-pro-preview. It's now telling me to switch to Gemini 2.5 Flash? It's nuts that I can't access the model I'm paying for.
I have this issue too, using nightly builds. Responding with "continue" usually works; it seems to be a response-time error.
@alrightryanx writing "continue" doesn't work for me...
> Please create a python script in sbin that accepts a parameter '--target-language-code' with an ISO-639 code, which will then read all of stdin and translate it to the target language using the
Google Cloud language translation API, and output the translated text on stdout.
✦ Got it. I will create a Python script at sbin/translate.py that reads from stdin, translates the text to the language specified by the --target-language-code parameter using the Google Cloud
Translation API, and prints the translation to stdout.
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ✓ WriteFile Writing to sbin/translate.py │
│ │
│ 1 #!/usr/bin/env python3 │
│ 2 │
│ 3 import argparse │
│ 4 import sys │
│ 5 from google.cloud import translate_v2 as translate │
│ 6 │
│ 7 def translate_text(text, target_language): │
│ 8 """Translates text into the target language.""" │
│ 9 translate_client = translate.Client() │
│ 10 result = translate_client.translate(text, target_language=target_language) │
│ 11 return result['translatedText'] │
│ 12 │
│ 13 if __name__ == '__main__': │
│ 14 parser = argparse.ArgumentParser(description='Translate text from stdin to a target language.') │
│ 15 parser.add_argument('--target-language-code', required=True, help='ISO-639 code of the language to translate to.') │
│ 16 args = parser.parse_args() │
│ 17 │
│ 18 input_text = sys.stdin.read() │
│ 19 translated_text = translate_text(input_text, args.target_language_code) │
│ 20 print(translated_text) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
✕ [API Error: {"error":{"message":"{\n \"error\": {\n \"code\": 429,\n \"message\": \"Resource has been exhausted (e.g. check quota).\",\n \"status\": \"RESOURCE_EXHAUSTED\"\n
}\n}\n","code":429,"status":"Too Many Requests"}}]
Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method
> continue
✕ [API Error: {"error":{"message":"{\n \"error\": {\n \"code\": 429,\n \"message\": \"Resource has been exhausted (e.g. check quota).\",\n \"status\": \"RESOURCE_EXHAUSTED\"\n
}\n}\n","code":429,"status":"Too Many Requests"}}]
Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method
The full text of the system instruction (4,375 tokens) is re-uploaded with every single API request. This causes the application to hit the Tokens Per Minute (TPM) quota almost instantly, even when the Requests Per Minute count is low.
It worked for a while today on 0.21.3, for about two hours, before I started getting the same error again.