feat(modelAvailabilityService): integrate model availability service into backend logic
Summary
This PR refactors the model fallback mechanism to be driven by the Model Availability Service. This introduces a more robust, policy-based approach for handling model failures. All functionality is gated by an experimental flag.
Details
The previous fallback logic was tightly coupled to specific model names and error types. By integrating the Model Availability Service, we can now define flexible policies for model selection, retries, and fallbacks in a centralized way.
- `geminiChat.ts`:
  - Integrates `selectModelForAvailability` to choose the best available model based on policy.
  - Dynamically updates `generateContentConfig` if the selected model differs or changes during retries.
- `retry.ts`:
  - `retryWithBackoff` now accepts a `getAvailabilityContext` callback to apply state transitions (e.g., `terminal`, `transient`) based on failure kind.
  - Marks models as healthy upon success.
- `handler.ts`:
  - Splits logic into legacy and policy-driven paths (gated by `isModelAvailabilityServiceEnabled`).
  - Fallback intents like `retry_always` now use `config.setActiveModel()`.
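For reviewers who want a mental model of the flow described above, here is a minimal sketch of policy-driven selection combined with retry. It is not the code in this PR: `AvailabilitySketch`, `retryWithAvailability`, the `classify` callback, and the model names are all hypothetical, and the real `retryWithBackoff` and `selectModelForAvailability` signatures may differ. The sketch is synchronous and omits backoff delays to stay short.

```typescript
// Hypothetical sketch only: these types and signatures are assumptions,
// not the API introduced by this PR.
type FailureKind = "terminal" | "transient";
type Health = "healthy" | FailureKind;

class AvailabilitySketch {
  private state = new Map<string, Health>();

  markHealthy(model: string): void {
    this.state.set(model, "healthy");
  }

  markFailed(model: string, kind: FailureKind): void {
    this.state.set(model, kind);
  }

  // Pick the first model in the policy chain that is not marked terminal.
  select(chain: readonly string[]): string | undefined {
    return chain.find((m) => this.state.get(m) !== "terminal");
  }
}

function retryWithAvailability<T>(
  chain: readonly string[],
  service: AvailabilitySketch,
  call: (model: string) => T,
  classify: (err: unknown) => FailureKind,
  maxAttempts = 3,
): { model: string; value: T } {
  let lastError: unknown = new Error("no model available");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Re-select on every attempt so a fallback can happen mid-retry.
    const model = service.select(chain);
    if (model === undefined) break;
    try {
      const value = call(model);
      service.markHealthy(model); // success marks the model healthy
      return { model, value };
    } catch (err) {
      lastError = err;
      service.markFailed(model, classify(err)); // state transition by failure kind
    }
  }
  throw lastError;
}
```

In this sketch a `terminal` failure removes a model from selection, so the next attempt falls through to the next entry in the chain, while a `transient` failure leaves the model eligible for another try.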
Related Issues
Fixes: https://github.com/google-gemini/maintainers-gemini-cli/issues/1079
How to Validate
npm run preflight
Pre-Merge Checklist
- [ ] Updated relevant documentation and README (if needed)
- [ ] Added/updated tests (if needed)
- [ ] Noted breaking changes (if any)
- [ ] Validated on required platforms/methods:
  - [ ] MacOS
    - [x] npm run
    - [ ] npx
    - [ ] Docker
    - [ ] Podman
    - [ ] Seatbelt
  - [ ] Windows
    - [ ] npm run
    - [ ] npx
    - [ ] Docker
  - [ ] Linux
    - [ ] npm run
    - [ ] npx
    - [ ] Docker
Summary of Changes
Hello @adamfweidman, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a significant refactoring of the model fallback mechanism, transitioning from a tightly coupled, error-type-specific approach to a more robust, policy-driven system powered by a new Model Availability Service. This change centralizes model selection, retries, and fallbacks, allowing for flexible policies to manage model health and ensure a smoother user experience, all gated by an experimental flag.
Highlights
- Policy-Driven Model Fallback: The core model fallback mechanism has been refactored to be driven by a new Model Availability Service, enabling flexible, policy-based model selection and retry strategies.
- Dynamic Model Selection & Configuration: The system can now dynamically select the best available model based on defined policies and update the generation configuration mid-retry if a fallback occurs, ensuring a smoother user experience.
- Enhanced Error Classification: A new `errorClassification.ts` module provides a standardized way to categorize API errors (e.g., terminal, transient, not_found), which informs the model availability policies.
- Centralized Model State Management: The `Config` class now includes an `_activeModel` property and methods (`setActiveModel`, `resetTurn`) to manage the currently active model and interact with the availability service.
- Integration into Core Components: The Model Availability Service is integrated into `BaseLlmClient`, `GeminiClient`, and `GeminiChat` to ensure consistent policy application across different content generation flows.
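To illustrate what a standardized error classification could look like, here is a rough sketch that maps an HTTP status code to a category. The category names `terminal`, `transient`, and `not_found` come from the summary above; the function name `classifyByStatus`, the extra `unknown` category, and the specific status-to-category mapping are assumptions for illustration and are not taken from `errorClassification.ts`.

```typescript
// Hypothetical sketch: the mapping below is an assumption, not the PR's logic.
type ErrorCategory = "terminal" | "transient" | "not_found" | "unknown";

function classifyByStatus(status: number | undefined): ErrorCategory {
  if (status === undefined) return "unknown"; // no status to inspect
  if (status === 404) return "not_found"; // e.g. model does not exist
  if (status === 429 || status >= 500) return "transient"; // worth retrying
  if (status >= 400) return "terminal"; // other client errors: do not retry
  return "unknown"; // non-error statuses are not classified
}
```

A policy layer could then key its decisions on the category rather than on raw status codes or model names, which is the decoupling this PR aims for.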
[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Size Change: +8.83 kB (+0.04%)
Total Size: 21.5 MB
| Filename | Size | Change |
|---|---|---|
| ./bundle/gemini.js | 21.5 MB | +8.83 kB (+0.04%) |
ℹ️ View Unchanged
| Filename | Size |
|---|---|
| ./bundle/sandbox-macos-permissive-closed.sb | 1.03 kB |
| ./bundle/sandbox-macos-permissive-open.sb | 890 B |
| ./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB |
| ./bundle/sandbox-macos-restrictive-closed.sb | 3.29 kB |
| ./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB |
| ./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB |
/gemini review