ERROR:tornado.access:503 POST /v1beta/tunedModels
Description of the bug:
Bug Report: Frequent 503 Errors with Tuned Models
Description
I have been using tuned models for a couple of months, and until the Gemini 2.0-flash release, everything worked perfectly. However, over the past two weeks, I have been experiencing a significant number of 503 errors.
These errors usually disappear after a few days but always return. Interestingly, when I switch back to a Gemini-base model (e.g., 1.5-flash or 2.0-flash), the 503 errors stop occurring.
Reproduction Steps
- Use a tuned model based on Gemini-1.5-flash.
- Run API requests as usual.
- Observe that 503 errors occur frequently.
- Switch to a Gemini-base model (e.g., 1.5-flash or 2.0-flash).
- Notice that the issue disappears.
I would greatly appreciate any insights or assistance on this matter.
Actual vs expected behavior:
Expected Behavior
Tuned models should work consistently without intermittent 503 errors.
Actual Behavior
Tuned models frequently return 503 errors, which disappear temporarily but always return.
Any other information you'd like to share?
Additional Information
- I currently have two tuned models based on Gemini-1.5-flash.
- The issue started occurring after the Gemini 2.0-flash release.
- Switching to a Gemini-base model resolves the issue.
I think we should try using exponential backoff for the 503 handling. Since switching to a base model resolves the issue, there's a possibility that the tuned models are hitting a backend issue or rate limiting. Has there been any official confirmation regarding this?
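As a rough sketch of the backoff idea, assuming your client surfaces a 503 as an exception: `ServiceUnavailableError` and `request_fn` below are hypothetical placeholders, not part of any Gemini SDK, so swap in whatever exception and call your own code uses. The delay doubles on each attempt, is capped, and is randomized ("full jitter") so that retries from several workers do not line up.

```python
import random
import time


class ServiceUnavailableError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 503."""


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request_fn(), retrying on 503 with exponential backoff and jitter.

    request_fn is any zero-argument callable that performs one API request
    (an assumption; wrap your tuned-model call in a lambda or function).
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except ServiceUnavailableError:
            # Out of retries: let the caller see the last 503.
            if attempt == max_attempts - 1:
                raise
            # Upper bound grows 1s, 2s, 4s, ... up to max_delay.
            upper = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time between 0 and the upper bound.
            sleep(random.uniform(0, upper))
```

Usage would look like `call_with_backoff(lambda: my_tuned_model_request(prompt))`, where `my_tuned_model_request` is your own function. This only smooths over transient 503s; if the cause is a quota on tuned models, retries will not remove it.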
There must be some rate limiting for tuned models, but there hasn't been any official confirmation. The issue was resolved at the beginning of the month but is now happening again, which suggests that tuned models might have a monthly usage quota or rate limit.
Hi @migueltorresvalls! Hope you are doing well. If this issue still persists, may I take a look and work on a fix? Thank you!