When the quota for gemini-3-pro-preview was reached, the CLI did not switch to another model and the request simply ended
What happened?
Here I made the request, and then made it again after the error.
It still redirects to http://localhost:4200/404 Does it have to be also in blogConfig.json?
Responding with gemini-3-pro-preview ✕ [API Error: [{ "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "errors": [ { "message": "Resource has been exhausted (e.g. check quota).", "domain": "global", "reason": "rateLimitExceeded" } ], "status": "RESOURCE_EXHAUSTED" } } ]] Possible quota limitations in place or slow response times detected. Switching to the gemini-2.5-flash model for the rest of this session.
It still redirects to http://localhost:4200/404 Does it have to be also in blogConfig.json?
Responding with gemini-3-pro-preview ✕ [API Error: [{ "error": { "code": 429, "message": "Resource has been exhausted (e.g. check quota).", "errors": [ { "message": "Resource has been exhausted (e.g. check quota).", "domain": "global", "reason": "rateLimitExceeded" } ], "status": "RESOURCE_EXHAUSTED" } } ]] Possible quota limitations in place or slow response times detected. Switching to the gemini-2.5-flash model for the rest of this session.
What did you expect to happen?
I expected the first request to continue with the lower-tier model, but instead the request ended. I then expected the lower-tier model to be used on retry, but when I made the same request, the same error happened.
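The expected behavior can be sketched as client-side fallback logic: on a 429, retry the same request once with the downgraded model instead of aborting the turn. This is a hedged sketch, not the CLI's actual implementation; `send_request` and `QuotaExceeded` are hypothetical stand-ins for the CLI's API call and its 429 error, and the model names are taken from the session above.

```python
# Sketch of expected fallback behavior on quota exhaustion.
# `send_request` and `QuotaExceeded` are hypothetical stand-ins,
# NOT real Gemini CLI internals.

FALLBACK_CHAIN = ["gemini-3-pro-preview", "gemini-2.5-flash"]

class QuotaExceeded(Exception):
    """Stand-in for the 429 RESOURCE_EXHAUSTED error."""

def send_with_fallback(prompt, send_request):
    """Try each model in the chain; retry the same prompt on quota errors."""
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return send_request(model, prompt)
        except QuotaExceeded as exc:
            last_error = exc  # remember the error, try the next model
    raise last_error          # every model in the chain was exhausted

# Example: the first model always 429s, the second succeeds.
def fake_send(model, prompt):
    if model == "gemini-3-pro-preview":
        raise QuotaExceeded("429 RESOURCE_EXHAUSTED")
    return f"{model}: ok"
```

Under this sketch, `send_with_fallback("hi", fake_send)` completes with the flash model rather than ending the request, which is what the report says should have happened.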
Client information
- CLI Version: 0.20.0
- Git Commit: d0ce3c4c5
- Session ID: dacfee0a-f1cb-499b-bcbc-a858194fbb4f
- Operating System: win32 v22.9.0
- Sandbox Environment: no sandbox
- Model Version: auto
- Memory Usage: 169.7 MB
Login information
No response
Anything else we need to know?
In /model I am using Auto
I'm paying for Gemini 3, yet I can't get a single task out of it via gemini-cli before it runs out of quota. I'm not even convinced I'm getting the Gemini 3 model; it's masked under the opaque name 'pro', though stats show I had 13 requests to gemini-3-pro-preview. It's now telling me to switch to Gemini 2.5 Flash? It's nuts that I can't access the model I'm paying for.
I have this issue too, using nightly builds. Responding with "continue" usually works; it seems to be a response-time error.
@alrightryanx writing "continue" doesn't work for me...
> Please create a python script in sbin that accepts a parameter '--target-language-code' with an ISO-639 code, which will then read all of stdin and translate it to the target language using the
Google Cloud language translation API, and output the translated text on stdout.
✦ Got it. I will create a Python script at sbin/translate.py that reads from stdin, translates the text to the language specified by the --target-language-code parameter using the Google Cloud
Translation API, and prints the translation to stdout.
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ✓ WriteFile Writing to sbin/translate.py │
│ │
│ 1 #!/usr/bin/env python3 │
│ 2 │
│ 3 import argparse │
│ 4 import sys │
│ 5 from google.cloud import translate_v2 as translate │
│ 6 │
│ 7 def translate_text(text, target_language): │
│ 8 """Translates text into the target language.""" │
│ 9 translate_client = translate.Client() │
│ 10 result = translate_client.translate(text, target_language=target_language) │
│ 11 return result['translatedText'] │
│ 12 │
│ 13 if __name__ == '__main__': │
│ 14 parser = argparse.ArgumentParser(description='Translate text from stdin to a target language.') │
│ 15 parser.add_argument('--target-language-code', required=True, help='ISO-639 code of the language to translate to.') │
│ 16 args = parser.parse_args() │
│ 17 │
│ 18 input_text = sys.stdin.read() │
│ 19 translated_text = translate_text(input_text, args.target_language_code) │
│ 20 print(translated_text) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
✕ [API Error: {"error":{"message":"{\n \"error\": {\n \"code\": 429,\n \"message\": \"Resource has been exhausted (e.g. check quota).\",\n \"status\": \"RESOURCE_EXHAUSTED\"\n
}\n}\n","code":429,"status":"Too Many Requests"}}]
Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method
> continue
✕ [API Error: {"error":{"message":"{\n \"error\": {\n \"code\": 429,\n \"message\": \"Resource has been exhausted (e.g. check quota).\",\n \"status\": \"RESOURCE_EXHAUSTED\"\n
}\n}\n","code":429,"status":"Too Many Requests"}}]
Please wait and try again later. To increase your limits, request a quota increase through AI Studio, or switch to another /auth method
The full text of the system instruction (4,375 tokens) is re-uploaded with every single API request. This causes the application to hit the Tokens Per Minute (TPM) quota almost instantly, even when the Requests Per Minute count is low.
It worked for a while today on 0.21.3, for about two hours, before I started getting the same error again.