
feat: Plan mode

Open linegel opened this issue 1 month ago • 17 comments

Summary

Implements plan mode feature that transforms Gemini CLI from an execution agent into a planning assistant. Users toggle between Agent Mode (default) and Plan Mode using the /plan slash command. In plan mode, the model focuses on analysis and planning without executing destructive operations, enabling safe codebase exploration and change review before execution.

This addresses the core need from issue #4666 to "force the model to think very hard through a problem" by blocking write tools and requiring comprehensive planning before any system modifications occur.

Closes #4666

Details

Implementation Approach:

This implementation uses hard enforcement through tool registry filtering, ensuring destructive tools are never declared to the model in plan mode. This architectural approach makes it impossible for the model to attempt destructive operations, going beyond prompt-only solutions.
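
A minimal sketch of what registry-level filtering could look like. The `ToolDeclaration` shape, the `source` field, and the standalone function signature are assumptions made for illustration; the actual `getFunctionDeclarations()` in `tool-registry.ts` may be structured differently. Only the destructive tool names come from this description.

```typescript
// Hypothetical shapes for illustration; the real gemini-cli types may differ.
type ToolSource = 'builtin' | 'mcp' | 'discovered';

interface ToolDeclaration {
  name: string;
  source: ToolSource;
}

// Built-in tools that this PR description lists as destructive.
const DESTRUCTIVE_TOOLS: ReadonlySet<string> = new Set([
  'replace',
  'write_file',
  'write_todos',
  'run_shell_command',
]);

// In plan mode a tool is declared only if it is built in and not destructive.
// MCP and discovered tools are always excluded because their safety is unknown.
function isAllowedInPlanMode(tool: ToolDeclaration): boolean {
  return tool.source === 'builtin' && !DESTRUCTIVE_TOOLS.has(tool.name);
}

// Agent mode returns every declaration; plan mode returns the filtered subset.
function getFunctionDeclarations(
  tools: readonly ToolDeclaration[],
  isPlanMode: boolean,
): ToolDeclaration[] {
  return isPlanMode ? tools.filter(isAllowedInPlanMode) : [...tools];
}
```

Because the filtered tools are never declared, the model has no schema to call them with, which is the difference from a prompt-only restriction.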

Core Architecture:

  1. Tool Registry Filtering (Primary Enforcement):

    • packages/core/src/tools/tool-types.ts: New file classifying tools as destructive vs read-only
    • packages/core/src/tools/tool-registry.ts: Modified getFunctionDeclarations() to filter destructive tools when isPlanMode === true
    • Destructive tools (replace, write_file, write_todos, run_shell_command) not declared to model in plan mode
    • MCP and discovered tools completely blocked (cannot verify safety of external/custom tools)
  2. Defense-in-Depth (Secondary Protection):

    • packages/cli/src/ui/hooks/useGeminiStream.ts: Blocks tool execution and displays calls as JSON if model attempts tool use
    • Provides failsafe if registry filtering fails
    • Offers transparency by showing what model intended to do
  3. State Management:

    • packages/core/src/config/config.ts: Added isPlanMode state with getter/setter
    • packages/cli/src/ui/contexts/UIStateContext.tsx: UI state tracking
    • Synchronized between core config and UI state via useGeminiStream hook
  4. System Prompts:

    • packages/core/src/core/prompts.ts: New getPlanModeSystemPrompt() instructs model on planning behavior
    • packages/core/src/core/client.ts: Conditional prompt selection in startChat()
    • Emphasizes analysis, explanation, and step-by-step planning
  5. User Interface:

    • packages/cli/src/ui/commands/planCommand.ts: New /plan slash command for toggling
    • packages/cli/src/ui/components/Footer.tsx: Visual indicator showing current mode ("Plan Mode" | "Agent Mode")
    • packages/cli/src/ui/AppContainer.tsx: Central state orchestration
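
The secondary, defense-in-depth check from item 2 could be sketched as below. This is not the code in `useGeminiStream.ts`: `ToolCallRequest`, `GuardResult`, and `guardToolCalls` are hypothetical names, and the sketch assumes read-only calls should still pass through while anything else is shown as JSON instead of being executed.

```typescript
// Hypothetical shape of a tool call requested by the model.
interface ToolCallRequest {
  name: string;
  args: Record<string, unknown>;
}

interface GuardResult {
  // Calls that may proceed to execution.
  toExecute: ToolCallRequest[];
  // Message to display when some calls were blocked; absent otherwise.
  blockedMessage?: string;
}

// Failsafe applied right before execution, in case registry filtering missed something.
function guardToolCalls(
  calls: readonly ToolCallRequest[],
  isPlanMode: boolean,
  allowedInPlanMode: ReadonlySet<string>,
): GuardResult {
  if (!isPlanMode) {
    return { toExecute: [...calls] };
  }
  const toExecute = calls.filter((call) => allowedInPlanMode.has(call.name));
  const blocked = calls.filter((call) => !allowedInPlanMode.has(call.name));
  if (blocked.length === 0) {
    return { toExecute };
  }
  return {
    toExecute,
    // Show the blocked calls as JSON so the user can see what the model intended.
    blockedMessage:
      'Plan mode: these tool calls were not executed:\n' +
      JSON.stringify(blocked, null, 2),
  };
}
```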

Tool Availability Matrix:

| Tool category | Plan Mode | Agent Mode |
| --- | --- | --- |
| read_file, read_many_files | ✅ Available | ✅ Available |
| search_file_content, glob, list_directory | ✅ Available | ✅ Available |
| save_memory, web_fetch, google_web_search | ✅ Available | ✅ Available |
| replace, write_file, write_todos | ❌ Blocked | ✅ Available |
| run_shell_command | ❌ Blocked | ✅ Available |
| MCP server tools | ❌ Blocked | ✅ Available |
| Discovered tools | ❌ Blocked | ✅ Available |

Design Decisions:

  • Simple toggle vs keyboard shortcut: /plan command integrates naturally with existing slash command system
  • No exit_plan_mode tool: Simple toggle approach is clearer (same command enters/exits)
  • Registry-level filtering: Prevents model from knowing destructive tools exist, rather than relying on prompts
  • Block all MCP/discovered tools: Conservative safety approach since we cannot verify external tool behavior
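
The single-command toggle can be sketched roughly as follows. `getIsPlanMode()` and the info messages appear in this description; the `setIsPlanMode()` setter name, the `InMemoryConfig` class, and the return shape of `runPlanCommand` are assumptions for illustration only.

```typescript
// Hypothetical slice of the config object; only getIsPlanMode() is named in this PR.
interface PlanModeConfig {
  getIsPlanMode(): boolean;
  setIsPlanMode(value: boolean): void;
}

// Minimal in-memory stand-in so the sketch runs without gemini-cli.
class InMemoryConfig implements PlanModeConfig {
  private isPlanMode = false;

  getIsPlanMode(): boolean {
    return this.isPlanMode;
  }

  setIsPlanMode(value: boolean): void {
    this.isPlanMode = value;
  }
}

interface PlanCommandResult {
  footerLabel: 'Plan Mode' | 'Agent Mode';
  info: string;
}

// The same command enters and exits plan mode by flipping the stored flag.
function runPlanCommand(config: PlanModeConfig): PlanCommandResult {
  const next = !config.getIsPlanMode();
  config.setIsPlanMode(next);
  return next
    ? { footerLabel: 'Plan Mode', info: 'Plan mode enabled' }
    : { footerLabel: 'Agent Mode', info: 'Plan mode disabled' };
}
```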

Documentation:

  • docs/plan-mode.md: Comprehensive guide covering features, usage, technical implementation, best practices, and future enhancements roadmap

Related Issues

Closes #4666

Continues #1742

How to Validate

1. Basic Toggle Test:

npm run build
npm start

# In Gemini CLI:
/plan
# Expected: Footer shows "Plan Mode |"
# Expected: Info message confirms "Plan mode enabled"

/plan
# Expected: Footer shows "Agent Mode |"
# Expected: Info message confirms "Plan mode disabled"

2. Tool Filtering Verification (Plan Mode):

# In Plan Mode, try:
"Edit the README to add a new section"

# Expected behavior:
# - Model describes what it would do but doesn't execute
# - Explains the changes it would make
# - Provides step-by-step implementation plan
# - No actual file modifications occur

3. Tool Filtering Verification (Agent Mode):

# Toggle back to Agent Mode, then:
"Edit the README to add a new section"

# Expected behavior:
# - Model actually executes edit tool
# - README file is modified
# - Changes are reflected on disk

4. Read-Only Tools Still Work (Plan Mode):

# In Plan Mode:
"Read the package.json and tell me the dependencies"

# Expected: Works normally, displays dependencies

5. MCP/Discovered Tools Blocked (Plan Mode):

# If you have MCP servers configured:
# 1. Toggle to Plan Mode
# 2. Try to use an MCP tool
# Expected: Tool not available to model, model explains it cannot use MCP tools in plan mode

# 3. Toggle to Agent Mode
# Expected: MCP tools work normally

6. Test Coverage:

npm run test
# Expected: All 3227 tests pass

7. Preflight Check:

npm run preflight
# Expected: All checks pass (ESLint, Prettier, tests, formatting)

8. Edge Cases:

# Test mid-session toggle behavior:
# 1. Start in Agent Mode
# 2. Have ongoing conversation
# 3. Toggle to Plan Mode
# 4. Ask for file modifications
# Expected: Model switches to planning mode, no execution

# Test state synchronization:
# 1. Toggle mode via /plan command
# 2. Check Footer displays correct mode
# 3. Verify config.getIsPlanMode() returns correct state

Pre-Merge Checklist

  • [x] Updated relevant documentation and README (if needed)
  • [x] Added/updated tests (if needed)
  • [x] Noted breaking changes (if any) - No breaking changes

Testing Matrix

Platform npm run npx Docker Podman Seatbelt
macOS
Windows N/A
Linux N/A

Usage example

image

linegel avatar Nov 19 '25 05:11 linegel

Summary of Changes

Hello @linegel, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a significant new feature called 'Plan Mode' for the Gemini CLI. Its primary purpose is to transform the CLI's behavior from an execution agent to a planning assistant, enabling users to safely explore codebases and develop detailed implementation strategies without the risk of unintended system modifications. This is achieved through a multi-layered approach involving tool filtering at the core level, UI-based execution blocking, and tailored system prompts, ensuring that the model focuses purely on analysis and planning when the mode is active.

Highlights

  • New Plan Mode Feature: Introduces a 'Plan Mode' to the Gemini CLI, allowing the model to act as a planning assistant without executing destructive operations. This mode is toggled using the /plan slash command.
  • Hard Enforcement of Tool Restrictions: Destructive tools (like file modification or shell execution) are filtered at the tool registry level when in Plan Mode, preventing the model from even being aware of them. This ensures a robust safety mechanism.
  • Defense-in-Depth UI Protection: As a secondary safeguard, the UI blocks the execution of any tool calls in Plan Mode, instead displaying them as JSON messages. This provides transparency and prevents unintended actions.
  • Dedicated System Prompts: The model receives specialized system prompts when in Plan Mode, instructing it to focus on analysis, planning, and explanation, rather than direct implementation.
  • UI Indicators and Documentation: The CLI footer now displays the current mode ('Plan Mode' or 'Agent Mode'), and comprehensive documentation (docs/plan-mode.md) has been added to explain the feature, its usage, and technical details.

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist[bot] avatar Nov 19 '25 05:11 gemini-code-assist[bot]

Addressing @vmiura feedback from the original PR #1742:

Non-destructive tools: Fixed. Read-only tools (read_file, grep, glob, etc.) are available in plan mode via registry-level filtering. The model can read files and search to create plans

Mid-session mode awareness: Known limitation. System prompt updates only on chat start. Workaround: run /clear after toggling modes. This can be improved in a follow-up PR to dynamically update the system prompt
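
To illustrate why the limitation exists, here is a simplified sketch in which the prompt is chosen once when the chat starts. `getPlanModeSystemPrompt()` and `startChat()` are named in the PR description, but the signatures, `getAgentSystemPrompt()`, `ModeState`, `Chat`, and the prompt strings are placeholders, and it assumes /clear results in a new chat being started.

```typescript
interface ModeState {
  isPlanMode: boolean;
}

interface Chat {
  readonly systemPrompt: string;
}

// Placeholder prompt text; the real prompts live in prompts.ts.
function getAgentSystemPrompt(): string {
  return 'You are an agent that may modify files.';
}

function getPlanModeSystemPrompt(): string {
  return 'Analyze and plan only; do not modify anything.';
}

// The prompt is selected once, at chat start, and then fixed for that chat.
function startChat(state: ModeState): Chat {
  const systemPrompt = state.isPlanMode
    ? getPlanModeSystemPrompt()
    : getAgentSystemPrompt();
  return { systemPrompt };
}
```

Toggling `isPlanMode` after `startChat()` leaves the existing chat's prompt untouched; only a chat created afterwards picks up the plan-mode prompt.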

The core implementation uses hard enforcement through tool registry filtering rather than prompt-only restrictions, preventing the model from knowing destructive tools exist in plan mode

linegel avatar Nov 19 '25 05:11 linegel

Thank you for taking this over. I had started https://github.com/google-gemini/gemini-cli/pull/7193 and wasn't able to finish. Any interest in merging the code to support configuring multiple modes?

That is, /mode plan rather than /plan.

vmiura avatar Nov 19 '25 14:11 vmiura

Thanks, @vmiura, for the feedback and for your previous work on this feature! It's great to see the continued interest.

Regarding the suggestion for /mode plan instead of /plan, the current implementation uses /plan as a simple toggle for a few reasons:

  • Simplicity: A single command to switch between the two primary modes (Agent and Plan) is straightforward for users.
  • Consistency: It aligns with other slash commands that act as toggles or single-action commands.

Your suggestion to support /mode <mode_name> is definitely interesting and could offer more flexibility, especially if we envision more modes in the future (e.g., a 'Review Mode' or 'Debug Mode'). This would allow for a more extensible mode management system.

For this initial PR, the focus was on establishing the core 'Plan Mode' functionality and its robust safety mechanisms. However, I agree that a more generalized /mode command could be a valuable enhancement for a follow-up PR. It would involve updating the command parsing and potentially the configuration management to handle multiple named modes.

Would you be interested in contributing to a follow-up PR for a more generalized mode switching mechanism? Your insights would be very valuable!
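
As a rough illustration of the command parsing such a follow-up would need, the sketch below parses `/mode <mode_name>`. Nothing here exists in the PR: `MODES`, `parseModeCommand`, and the error messages are hypothetical, and only the two modes this PR defines are listed.

```typescript
// Hypothetical list of named modes; only agent and plan exist in this PR.
const MODES = ['agent', 'plan'] as const;
type Mode = (typeof MODES)[number];

type ParseResult =
  | { ok: true; mode: Mode }
  | { ok: false; error: string };

// Type guard that narrows an arbitrary string to a known mode name.
function isMode(value: string): value is Mode {
  return (MODES as readonly string[]).includes(value);
}

// Parses input such as "/mode plan" into a validated mode name.
function parseModeCommand(input: string): ParseResult {
  const [command, name, ...rest] = input.trim().split(/\s+/);
  if (command !== '/mode') {
    return { ok: false, error: 'Not a /mode command.' };
  }
  if (name === undefined || rest.length > 0) {
    return { ok: false, error: `Usage: /mode <${MODES.join('|')}>` };
  }
  const normalized = name.toLowerCase();
  if (!isMode(normalized)) {
    return { ok: false, error: `Unknown mode "${name}".` };
  }
  return { ok: true, mode: normalized };
}
```

Adding a further mode would then mean extending `MODES` rather than adding another slash command.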

gemini-code-assist[bot] avatar Nov 19 '25 14:11 gemini-code-assist[bot]

@vmiura I'm afraid this PR is already borderline too big, but I would be happy to take over #7193 as a separate PR once these changes are merged

linegel avatar Nov 19 '25 16:11 linegel

@chrstnb @mattKorwel could you please help with a review?

https://github.com/google-gemini/gemini-cli/blob/main/.github/CODEOWNERS doesn't explicitly define who the actual code owner(-s) of the changed parts are, so I can't mention them specifically

Reasoning to prioritise review: To me, the lack of plan mode is one of the key issues that prevents migration from Claude Code to Gemini-cli as a main tool, and using the extension just doesn't match the UX/DX that CC has to offer. It's likely to be the same for any power user who has been using it for at least a few months and will consider migrating to gemini-cli.

linegel avatar Nov 19 '25 17:11 linegel

For me, it's the same. This missing feature is the reason I haven't changed to Gemini over Claude Code. I think a lot of professionals feel the same. Please move this up your list of priorities.

nxcco avatar Nov 23 '25 10:11 nxcco

Plan mode is as fundamental to agentic coding as copy/paste for text editors.

cesarvarela avatar Nov 24 '25 19:11 cesarvarela

I'm a newcomer who recently started following the gemini cli source code. I noticed that SubagentToolWrapper doesn't seem to be within the scope of consideration?

Sugar-107 avatar Nov 25 '25 10:11 Sugar-107

Throwing my vote: With the codebase_investigator agent set to max_turns = -1, I'm starting to see more parity with claude code. I think getting this plan mode phase merged will make gemini-cli even more competitive. The pareto workflow (as defined by just cost :: output quality) now is to use claude code to understand the codebase and create a detailed plan, and then gemini-cli to execute. Would love to see a world where the pareto workflow is just gemini-cli!

navkast avatar Nov 25 '25 14:11 navkast

Hi @linegel this is awesome!

Will get a maintainer to take a closer look soon 👏

jackwotherspoon avatar Nov 26 '25 01:11 jackwotherspoon

@linegel is this PR ready for review?

Just tried to quickly test it out and it seems to be having some strange behavior?

image

Is this intended?

I also wonder if /plan is the easiest method to enable plan mode? Or if Shift+tab to cycle between "accepting edits", "plan mode" and regular agent mode similar to CC is a better UX experience. Open to discussion here.

Maybe other folks can weigh in on what they would like to see?

jackwotherspoon avatar Nov 26 '25 01:11 jackwotherspoon

Thanks for having a look @jackwotherspoon!

Nicely spotted, it is definitely a bug in the tool limitation in useGeminiStream

I also enhanced the system prompt used for planning mode; an example of plan mode is in the screenshots below:

[two screenshots of plan mode output]

linegel avatar Nov 26 '25 06:11 linegel

@jackwotherspoon

Another good point is using Shift+Tab to switch modes. If adopted, this would also help with migration from CC to Gemini CLI

However, I want to point out that Plan mode in CC evolved after numerous iterations, and I would recommend that we follow the same process here:

  1. Let's introduce a basic plan mode that is immediately helpful and allows the user to review changes planned by the model (this PR #13366).
  2. Then, we can consider adding a tool dedicated to planning, similar to CC, with options to:
    • Confirm plan and auto-approve changes
    • Confirm plan and manually approve changes
    • Decline plan and provide feedback on changes
  3. In parallel, via another PR, we can integrate the Shift+Tab mode switch. This needs to handle changes mid-session, even when the model has already started performing work
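
For reference, the Shift+Tab cycle from step 3 could be as small as the sketch below. The mode names and their order are assumptions loosely based on the three states mentioned earlier in the thread, and the sketch deliberately ignores the harder part: what to do when the model is already mid-task.

```typescript
// Hypothetical mode names and cycle order; not part of this PR.
const MODE_CYCLE = ['agent', 'accept-edits', 'plan'] as const;
type UiMode = (typeof MODE_CYCLE)[number];

// Returns the mode that a single Shift+Tab press would switch to.
function nextMode(current: UiMode): UiMode {
  const index = MODE_CYCLE.indexOf(current);
  return MODE_CYCLE[(index + 1) % MODE_CYCLE.length];
}
```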

linegel avatar Nov 26 '25 06:11 linegel

Thanks for making the quick changes @linegel 👍

> I want to point out that Plan mode in CC evolved after numerous iterations, and I would recommend that we follow the same process here

I totally agree with evolving and iterating on plan mode over time to make it more robust.

However, I think the UX part is very important to get right from the start. I will make sure someone from our UX team takes a look and can weigh in better on /plan vs Shift+tab and other UI elements we may want here. In my mind it is okay if the core feature evolves but ideally the UX remains the same (or as close to the same as we can)

jackwotherspoon avatar Nov 26 '25 12:11 jackwotherspoon

@jackwotherspoon got your point

Then I will need input from the UX team on how plan/work mode should be handled when toggled mid-session 🙏

linegel avatar Nov 26 '25 12:11 linegel

Hi @linegel, nice work on this - plan mode is a much-needed feature.

One question: why is write_todos in the destructive tools list? It seems like writing a todo list would be useful in plan mode - the model could lay out its planned steps for the user to review before switching to agent mode.

Also, since toggling /plan mid-session doesn't update the system prompt until /clear is run, it might help to make that explicit in the info message, e.g., "Plan mode enabled. Run /clear for full effect."

afarber avatar Nov 29 '25 16:11 afarber

Thank you so much @linegel for all your hard work on this! We will definitely take everything you did in this PR under consideration as we work to implement plan mode and loop you in as we start to have PRs ready for review. However unfortunately we won't be able to accept this PR. There are a lot of nuances to how this will need to be implemented to make the experience completely seamless for users.

To make it clearer which issues we will accept community contributions on in the future, we have now added a 🔒 maintainer only label to issues where a community contribution is unfortunately unlikely to be something we can accept. We would welcome your contribution on any of the other high priority issues on the repo that don't have the maintainer only label.

jacob314 avatar Dec 10 '25 20:12 jacob314

Ok, thanks @jacob314

Happy to hear that this feature is being taken seriously

Another indirect but important reason for it is that you will be able to pass feedback directly to the teams that work on Gemini. Currently, in all my tests and prompts, the planning phase specifically is the model's weakest side and has a lot of potential for improvement during post-training upgrades 👍

linegel avatar Dec 10 '25 22:12 linegel

Thank you for understanding! Looking forward to working with you on other issues in the future. Cases you mentioned are the sort of reason why this is a hard feature to land unless you are a maintainer.

jacob314 avatar Dec 12 '25 16:12 jacob314