aider
Add tool-using NavigatorCoder (/navigator mode), akin to Claude Code
One more PR, this one simultaneously very very complete and yet speculative.
In case anyone wants to try this out, I uploaded it to PyPI as navigator-mode, until (and of course if!) the PR is accepted. By I, I mean that it uploaded itself. You can see the session where it did that here: https://asciinema.org/a/9JtT7DKIRrtpylhUts0lr3EfY
Summary
This PR adds a NavigatorCoder (started by --navigator and /navigator) to aider, which is able to:
- Run various tools: <I had a list here, but I removed it because it got too long>.
- Edit files using the standard SEARCH/REPLACE behavior
- With /granular-editing turned on, edit files using fine-grained edits through tool use, with feedback diffs sent to the LLM and the ability for the LLM to undo!
- Send the directory structure, git status and a little history, to prime the LLM to look for files in the codebase appropriately.
- For testing, I most recommend Gemini 2.5 Pro or Sonnet.
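As a rough illustration of the granular-editing idea above (an edit applied through a tool, a diff returned as feedback, and the option to undo), here is a minimal sketch. It is not the code in this PR: `EditSession`, `replace_text` and `undo` are hypothetical names, and the files live in memory to keep the example self-contained.

```python
import difflib


class EditSession:
    """Hypothetical in-memory editor: apply an edit, report a diff, allow undo."""

    def __init__(self, files):
        self.files = dict(files)  # path -> current content
        self.history = []  # stack of (path, content before the edit)

    def replace_text(self, path, old, new):
        before = self.files[path]
        # Refuse ambiguous edits so the model has to be specific.
        if before.count(old) != 1:
            raise ValueError("expected exactly one match for the text to replace")
        after = before.replace(old, new)
        self.history.append((path, before))
        self.files[path] = after
        # The unified diff is what would be sent back to the LLM as feedback.
        diff = difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
        return "".join(diff)

    def undo(self):
        # Restore the content saved before the most recent edit.
        path, before = self.history.pop()
        self.files[path] = before
        return path


session = EditSession({"greet.py": "print('hello')\nprint('bye')\n"})
feedback = session.replace_text("greet.py", "hello", "hi")
```

Returning the diff rather than the whole file keeps the feedback small, and the undo stack is what makes a mistaken edit cheap to reverse.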
Compatibility
I've tried very hard to keep changes isolated to just navigator_coder.py, navigator_prompts.py and aider/tools. There is currently the context management extension, which adjusts BaseCoder, but it's very much optional and could be moved down to NavigatorCoder if that were preferable.
Quirks
Cost-optimization
I'm relatively confident that switching away from the standard aider-style mode of sending files with every commit would result in meaningful token- and therefore cost-savings, but the models seem to behave with meaningfully more intelligence when we can keep sending files in full.
I wonder if there might be room for a world in which the models only see that which they just asked for at any given time... or reduced chat history, or reduced use of reflected messages, or similar. For now I've gone for compatibility and highest-intelligence over cost savings as the default mode of operation. Perhaps it could be a toggle-able setting?
I had briefly integrated functionality to decay files not-in-use over time, as well as to reduce files exposed to the LLM, but my concern is that this may cause divergence from a task over a long series of steps in some cases.
Tools
Tools are stored in aider/tools and, at the time of writing, I'm porting more of them to use tool_utils.py (without which they're somewhat duplicative). That's going very well, though.
This is best used with #3778
... because in some (rare) cases it will add and then remove large numbers of files
I added /context-management
This is a toggle-able truncation of large files, only enabled for Navigator by default. It sends a subset of the file, allowing the LLM to sort of see the file contents, but with less overwhelm. That way it has the chance to look at them and then remove them, for example.
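To give a feel for what such a truncation could look like, here is a minimal sketch. Which subset of the file is actually sent is not stated above, so keeping the head and tail of the file is an assumption, and `truncate_for_context` with its parameters is a hypothetical name rather than the function in this PR.

```python
def truncate_for_context(text, max_lines=200, head=100, tail=50):
    """Return `text` unchanged if it is short, else its head and tail with a marker."""
    if head + tail > max_lines:
        raise ValueError("head + tail must not exceed max_lines")
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    omitted = len(lines) - head - tail
    # The marker tells the model that content is missing and how much.
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])


big_file = "\n".join(f"line {i}" for i in range(1000))
preview = truncate_for_context(big_file, max_lines=10, head=3, tail=2)
```

The explicit marker matters: without it the model cannot tell a truncated file from a short one.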
Tool calling format and a (current) lack of output elision
I tried for a while to use XML-style tool calling tags, but they cause rich / mdstream to emit a lot of blank lines and the tags are stripped, leaving garbage in the output. Despite multiple attempts to suppress streaming or final output, I was unable (so far) to achieve something like 'hide everything inside a <function_calls> tag' from the output whilst also streaming.
As a result, I went with the other common tool call syntax of [...], using the very explicit [tool_call(Cmd, keyword_arg="something", other_arg="something")] to make false positives almost impossible and parsing simple and fast.
Tool calls are implemented very robustly, with a parens-scanning mini-parser and Python's ast.parse, so you can for example edit lines mentioning tool calls with tool calls, or SEARCH/REPLACE blocks with tool calls, etc.
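The approach described above (scan for the marker, find the matching parenthesis, then hand the argument list to Python's `ast`) can be sketched roughly as follows. This is an illustration under assumptions, not the parser in this PR: `find_call_end` and `parse_tool_calls` are hypothetical names, and the scanner only understands simple single- and double-quoted strings.

```python
import ast

MARKER = "[tool_call("


def find_call_end(text, start):
    """Return the index of the ')' closing the call whose arguments begin at `start`, or -1."""
    depth = 1
    quote = None
    i = start
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


def parse_tool_calls(text):
    """Extract (tool_name, kwargs) pairs from [tool_call(Name, key="value")] markers."""
    calls = []
    pos = 0
    while True:
        start = text.find(MARKER, pos)
        if start == -1:
            return calls
        args_start = start + len(MARKER)
        end = find_call_end(text, args_start)
        if end == -1 or text[end + 1 : end + 2] != "]":
            pos = args_start  # not a well-formed call, keep scanning
            continue
        try:
            # Wrap the arguments in a dummy call so ast can parse them.
            node = ast.parse("f(" + text[args_start:end] + ")", mode="eval").body
            name = node.args[0].id
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        except (SyntaxError, ValueError, IndexError, AttributeError):
            pos = args_start
            continue
        calls.append((name, kwargs))
        pos = end + 2


reply = """Okay. [tool_call(Command, command_string="git log -S 'f(x)]'")] [tool_call(Continue)]"""
calls = parse_tool_calls(reply)
```

Because the scanner tracks string literals, brackets and parentheses inside an argument value (as in the `command_string` above) do not end the call early, and `ast.literal_eval` means argument values are only ever treated as literals, never executed.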
SEARCH/REPLACE editing is stolen from the main editor
Editing seems relatively un-abstracted in the codebase right now, so Navigator uses a duplicate of the editing logic for that portion of things.
As I've started to add a /lot/ of granular tool-based editing, this is also de-emphasized and used only as a fallback.
Tests
I usually add tests to PRs, but I've avoided them here for now, both because it's been changing a great deal and because looped LLM calls probably deserve attention on testing.
I've been able to very successfully get Navigator to test itself, though! You can see a session here, where it looped through every tool in sequence, running it, checking that the file had been appropriately changed and then restoring changes with a command:
https://asciinema.org/a/eqnvZ57O7nVWkpc15NYO1tMHx
Bottom line
Having used Claude Code a great deal, this... behaves very much similarly and feels complete with the possible absence of #3672. I really appreciate that this is so relatively-straightforwardly possible, given how aider is implemented! Thank you again for producing aider as open source!
As mentioned above, this PR is very much speculative -- I'm using and developing it (with itself!) continuously, but I imagine that you may have your own plans or suggestions for this sort of functionality, so I very much defer to you!
Example sessions
Publishing self to PyPI and then later updating: https://asciinema.org/a/9JtT7DKIRrtpylhUts0lr3EfY https://gist.github.com/tekacs/b92d508a06b8f802611b00d1529c3907
Testing out all the tools: https://asciinema.org/a/eqnvZ57O7nVWkpc15NYO1tMHx
Looking for bugs in its own editing tools: https://gist.github.com/tekacs/c07454aec86b1e312cf03ece4e68e5a9
navigator> Run cowsay and validate that you get to see its output
Okay, I can run the cowsay command for you. What message would you like the cow to say?
[tool_call(Command, command_string="cowsay 'Moo!'")] [tool_call(Continue)]
Tokens: 19k sent, 47 received. Cost: $0.02 message, $0.02 session.
cowsay 'Moo!'
Allow execution of this command? (Y)es/(N)o/(D)on't ask again [Yes]:
⚙️ Executing non-interactive shell command: cowsay 'Moo!'
 ______
< Moo! >
 ------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Continuing to explore the codebase based on tool results...
Yes, I successfully ran the cowsay 'Moo!' command and received the following output:
 ______
< Moo! >
 ------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
This confirms that I can execute shell commands and see their output.
Tokens: 19k sent, 79 received. Cost: $0.03 message, $0.05 session.
navigator> I swear that I added an undo-add command (cmd_undo_add) to this codebase. Can you figure out what happened to it?
Okay, I can help investigate that. It's possible the command was renamed, moved, or accidentally removed.
To figure this out, I'll start by searching the git history for any commits that added or removed the string cmd_undo_add within the aider/commands.py file, as that's the most likely place
for command definitions.
[tool_call(Command, command_string="git log -S cmd_undo_add -- aider/commands.py")] [tool_call(Continue)]
Tokens: 20k sent, 98 received. Cost: $0.03 message, $0.03 session.
git log -S cmd_undo_add -- aider/commands.py
Allow execution of this command? (Y)es/(N)o/(D)on't ask again [Yes]:
⚙️ Executing non-interactive shell command: git log -S cmd_undo_add -- aider/commands.py
Continuing to explore the codebase based on tool results...
Okay, the git log -S command didn't find any commits that introduced or removed the exact string cmd_undo_add in aider/commands.py.
This could mean a few things:
1 The command might have had a slightly different name (e.g., cmd_undoadd, cmd_add_undo).
2 It might have been added and removed in the same commit, or the change wasn't committed.
3 It might have been implemented in a different file (less likely for a command).
Let's broaden the search. I'll search the entire git history across all files for the string cmd_undo_add. This might take a bit longer.
[tool_call(Command, command_string="git log -S cmd_undo_add")] [tool_call(Continue)]
Tokens: 20k sent, 163 received. Cost: $0.03 message, $0.05 session.
git log -S cmd_undo_add
Allow execution of this command? (Y)es/(N)o/(D)on't ask again [Yes]:
⚙️ Executing non-interactive shell command: git log -S cmd_undo_add
Continuing to explore the codebase based on tool results...
Okay, the broader search across the entire git history also didn't find cmd_undo_add.
Let's try searching the current version of the aider/commands.py file for related keywords like "undo" or "remove" to see if the command exists under a different name or if related
functionality is present.
[tool_call(Add, file_path="aider/commands.py")] [tool_call(Grep, pattern="undo", file_pattern="aider/commands.py")] [tool_call(Grep, pattern="remove", file_pattern="aider/commands.py")]
[tool_call(Continue)]
Tokens: 20k sent, 123 received. Cost: $0.03 message, $0.08 session.
📎 Added 'aider/commands.py' to context as read-only
🔍 Found 'undo' in: aider/commands.py (12 matches)
🔍 Found 'remove' in: aider/commands.py (40 matches)
Continuing to explore the codebase based on tool results...
Okay, I've examined the aider/commands.py file that's now in our chat context.
Based on the current code:
1 There is no command defined as cmd_undo_add.
2 The command /undo (implemented by cmd_undo) is specifically for undoing the last git commit made by aider, not for undoing the addition of a file to the chat session.
3 The command /drop (implemented by cmd_drop) is used to remove files from the chat session. This is the command that effectively reverses the action of /add.
It seems most likely that:
• You might be thinking of the /drop command, which serves the purpose of removing files added via /add.
• Or, perhaps the command existed temporarily in your local workspace or a branch but wasn't committed or was later removed/renamed before being committed under that specific name. The git
history searches we ran didn't find any trace of cmd_undo_add.
Does the functionality of the /drop command match what you remember for /undo-add?
amar@teklin ~/r/e/aider ((c5de7c1e))> aider --model gemini --no-auto-commits --navigator
Aider v0.81.2.dev19+g8f9f9210.d20250410
Main model: gemini/gemini-2.5-pro-preview-03-25 with navigator edit format
Weak model: gemini/gemini-2.0-flash
Git repo: .git with 650 files
Repo-map: using 4096 tokens, files refresh
Cost estimates may be inaccurate when using streaming and caching.
navigator> Replace 'Continuing to explore the codebase based on tool results...' with something more terse.
Okay, I can help with that. I'll search for the phrase and propose a replacement.
[tool_call(Grep, pattern="Continuing to explore the codebase based on tool results...", file_pattern="aider/coders/*.py")]
[tool_call(Continue)]
Tokens: 20k sent, 60 received. Cost: $0.03 message, $0.03 session.
🔍 Found 'Continuing to explore the codebase based on tool results...' in: aider/coders/navigator_coder.py (1 matches)
Continuing to explore the codebase based on tool results...
Okay, I found the string in aider/coders/navigator_coder.py. I will make that file editable and propose the change.
[tool_call(MakeEditable, file_path="aider/coders/navigator_coder.py")]
aider/coders/navigator_coder.py
<<<<<<< SEARCH
        self.reflected_message = "\n".join(next_prompt_parts)
        self.io.tool_output("Continuing to explore the codebase based on tool results...")
        return False # Indicate that we need another iteration
    else:
        # Exploration finished for this turn.
=======
        self.reflected_message = "\n".join(next_prompt_parts)
        self.io.tool_output("Continuing exploration...")
        return False # Indicate that we need another iteration
    else:
        # Exploration finished for this turn.
>>>>>>> REPLACE
Tokens: 28k sent, 180 received. Cost: $0.04 message, $0.06 session.
📝 Made 'aider/coders/navigator_coder.py' editable
Detected edit blocks, applying changes within Navigator...
Applied edit to aider/coders/navigator_coder.py
haven't tried yet, but this sounds really cool, thank you for your effort!
I noticed that with yes-always: true, every command invocation is skipped.
I wondered if something like a requires_approval parameter (as seen in Cline/Roo Code) could be utilized here? Invoking git log should be pretty safe?
Or maybe a new config parameter like navigator-command-requires-confirmation: true?
Oh, aider/tools/command.py already has requires_approval=True - my LLM (Deepseek v3) just decided to not set it to False. Maybe making it mandatory helps?
The issue is that aider/coders/navigator_prompts.py does not list the requires_approval parameter, thus it will never be used, and it is always True.
Prior to reading the source, I was hoping that the navigator_prompts.py constructs its prompt dynamically based on existing tools/files in the tools/ directory. That is (apparently?) not the case, and might be a bit hard to pull off, but might be something to consider, to not have the tools in the prompt deviate from the actual implementation?
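One way the idea of generating the prompt from the tools could look is sketched below. Everything here is hypothetical (`ToolSpec`, `render_tool_prompt`, and the descriptions are made up for illustration and are not taken from aider/tools); the point is only that a parameter such as requires_approval shows up in the prompt as soon as it is declared on the tool.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical declaration of a tool: its name, purpose and parameters."""

    name: str
    description: str
    params: dict = field(default_factory=dict)  # parameter name -> short description


def render_tool_prompt(tools):
    """Build the tool section of a prompt from the declared specs."""
    lines = []
    for tool in sorted(tools, key=lambda t: t.name):
        args = ", ".join(f'{p}="..."' for p in tool.params)
        signature = f"[tool_call({tool.name}, {args})]" if args else f"[tool_call({tool.name})]"
        lines.append(f"- {signature}: {tool.description}")
        for param, doc in tool.params.items():
            lines.append(f"    {param}: {doc}")
    return "\n".join(lines)


# Illustrative specs only; the real tools and their wording may differ.
TOOLS = [
    ToolSpec(
        "Command",
        "Run a shell command.",
        {
            "command_string": "the command to run",
            "requires_approval": "set to False only for read-only commands",
        },
    ),
    ToolSpec("Continue", "Ask for another turn after the tool results come back."),
]
prompt_section = render_tool_prompt(TOOLS)
```

With a single source of truth like this, the prompt cannot list a parameter the implementation lacks, or omit one it has.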
Great suggestions @Dima-369, thank you. I've considered generating the prompt from the tools and that'll probably make it in soon.
What I'm currently quite stuck on is figuring out why I seemingly can't get Sonnet to cache prefixes of prompts reliably, despite efforts to get things in the right order, with seemingly solid cache_control markers.
Frankly... I can't seem to get it to work in general, even in /ask mode or similar.
This causes relatively runaway token usage, so I'm still investigating. If anyone stumbles across this comment and has any insight, it would be appreciated!
@tekacs thanks for this massive amount of work!!!
did a short test, seems quite potent overall.
one thing that for me personally currently keeps me going back to claude is that, aider currently only works on/with the files available in the main git repository, which sadly is problematic with multiple projects of mine, which consist of a main repository with multiple related folders that are themselves git repositories (symfony bundles). aider is currently basically not useful for combinations like this because it only knows/works with the files inside the git of the main repository.
any plans/ideas to address stuff like this?
Glad to hear it! It's definitely addressable, although the way that I've coded this PR is deliberately light on touching some base code in Aider, to make things more merge-able (although still some work required for that).
Something that I think would make it much easier to interact with non-Git files that I've introduced in this PR is a tool that allows the LLM to undo changes that it's made. This makes it less risky for the LLM to perform edits outside of the immediate Git repo.
But I think that teaching Aider to wrangle multiple Git repos at once would be a pretty hefty change to how the base works. That would be better for a different PR.
Within the confines of this one, if you want to give it a try, I would suggest asking an LLM to add a tool or two under aider/tools to allow it to manipulate and read those Git repositories. I would adjust enhanced context blocks to use something like fd instead of the current Git-based approach that's used here. Finally, I would allow MakeEditable and View to operate on non-repo files. Those three things together would probably make it work pretty well.
Once this stabilizes, I can certainly give it a go, but you might be able to take my comment and feed it to Navigator (I'd use Gemini) or to Claude Code and have it generate a good working prototype for you.
I experimented with /copy-context and noticed that repo_map appears at the beginning. This means if the map changes, it could invalidate part of the subsequent cache. However, I believe this behavior is consistent with the current stable version.
Regarding your comment about can't get Sonnet to cache prefixes of prompts reliably, I'm not entirely clear on what you mean. The Aider documentation does mention: Aider can ping the provider to keep your prompt cache warm and prevent it from expiring. By default, Anthropic keeps your cache for 5 minutes. Use --cache-keepalive-pings N to tell aider to ping every 5 minutes to keep the cache warm. Aider will ping up to N times over a period of N*5 minutes after each message you send.
I also briefly tested Navigator and was impressed with the experience! The automated processes add minimal overhead, which was a pleasant surprise - especially coming from a CLI background. One note: currently, running aider --navigator doesn't automatically enter navigator mode; you need to manually switch using /chat-mode navigator.
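The point above about repo_map sitting at the beginning can be illustrated with a deliberately simplified model of prefix caching, in which only the leading blocks that are identical to the previous request can be served from cache. Real providers work at token granularity and at marked breakpoints, so this is a sketch of the principle only; `shared_prefix_blocks` and the block labels are hypothetical.

```python
def shared_prefix_blocks(previous, current):
    """Count the leading prompt blocks that are identical in both requests."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


# Repo map near the top: a changed map invalidates everything after it.
map_first = shared_prefix_blocks(
    ["system", "repo map v1", "files", "history"],
    ["system", "repo map v2", "files", "history"],
)

# Repo map at the end: the stable blocks before it stay reusable.
map_last = shared_prefix_blocks(
    ["system", "files", "history", "repo map v1"],
    ["system", "files", "history", "repo map v2"],
)
```

In this toy model the first ordering keeps one block cacheable and the second keeps three, which is why the position of frequently changing content matters.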
Hullo! I'm glad that it's felt smooth! If you look at the screenshot below, you'll notice that the --navigator CLI flag seems to work for me? If you're certain that this doesn't work for you, I can check whether e.g. I need to re-publish to PyPI or similar. As a shorthand, there's a /navigator command too.
On caching -- yeah, the base repo seems to put the repo map up top, which... maybe is worth fiddling with. After some experimentation, I did find that I was able to get some caching to work, but I still haven't fully figured out its dynamics. In particular I want to make sure that we don't send too many un-cached tokens Sonnet's way, since they rate limit based on un-cached amount.
Speaking of CLI, with the release of https://github.com/openai/codex, I notice that they use a lot of CLI tools directly in place of some tools that we use here. While I could imagine that causing divergence across platforms and across models, I wonder if that's a viable option to boost some models' performance?
I noticed that you are using Claude from the OpenRouter provider. I remember it uses the OpenAI format. Unlike Anthropic, it only supports caching of the text part and does not support caching of the function call part.
Is it working like Aider as an Agent?
I feel this PR conflicts with MCP support PR (#3937). It's probably unwise to have different implementations for how tools work with aider, and I feel MCP is probably the way to go for aider to enable custom tools and extensions
Yes and no, both approaches have their place / role to play.
MCP does include a back and forth with the LLM and increases token consumption.
sequenceDiagram
participant User
participant CLI as CLI Tool (LLM Client)
participant LLM as Large Language Model
participant MCP as MCP Server
participant Tools as External Tools/Resources
User->>CLI: 1. Gives instruction/query
CLI->>LLM: 2. Forwards instruction/query (Prompt)
activate LLM
LLM->>LLM: 3. Reasons & decides on action/tool
alt LLM needs information/action from MCP
LLM->>CLI: 4. Requests tool invocation/resource access (Tool Call/Resource Request)
CLI->>MCP: 5. Forwards JSON-RPC request (e.g., `tools/call`, `resources/read`)
activate MCP
MCP->>Tools: 6. Executes tool / Retrieves resource
activate Tools
Tools-->>MCP: 7. Returns result/data
deactivate Tools
MCP-->>CLI: 8. Returns JSON-RPC response (Result/Error)
deactivate MCP
CLI->>LLM: 9. Returns MCP response
LLM->>LLM: 10. Incorporates new information into reasoning
end
LLM-->>CLI: 11. Provides final response/action
deactivate LLM
CLI-->>User: 12. Displays final output/result
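For reference, step 5 in the diagram is a JSON-RPC 2.0 message. A minimal sketch of building a `tools/call` request follows; the helper `make_tools_call_request` and the tool name `run_command` are hypothetical, while the `method` and the `name`/`arguments` fields follow the MCP specification.

```python
import json


def make_tools_call_request(request_id, tool_name, arguments):
    """Build the JSON-RPC 2.0 payload for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `run_command` is a made-up tool name for illustration.
request = make_tools_call_request(1, "run_command", {"command": "git status"})
wire_message = json.dumps(request)
```

Each such request and its response is an extra round trip outside the model, which is the overhead the comment above is pointing at.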
Depending on what the mcp server does, it can be wild over-engineering to force everything as MCP.
Interesting read to compare / understand / improve this https://gerred.github.io/building-an-agentic-system/
Note: there is another open PR which adds MCP support here. Perhaps this PR should use the MCP functionality from that PR?
Guys, this is the best feature Aider has; why not prioritize it over other things? The only reason I stopped using this project was because this feature was missing.
Just wanted to jump in and say – thanks for taking some initiative here, @dwash96 – and thanks to folks who've been trying this out.
Please do let me know any issues that y'all run into – in the background I've been throwing together another full-featured coding agent (as a library – will be at tekacs/libca) in Rust (non-commercially / not for my gain – my actual startup does something totally unrelated).
But – I have a bunch of queued work to improve navigator mode and I'm happy to jump in and work on it if there are any particular questions or considerations any of you have run into.
I'd love to see this merged. Curious on why there are 170+ PRs open (being that aider is being mostly written by aider, maybe same could be done for PR reviews)?
Would love to see aider continue to be the top dawg, but the industry is moving fast and if this PR bottleneck doesn't get resolved I'd hate to see aider get left in the dust...

