Aider analysis of editing failures from

Open jmarkmorris opened this issue 8 months ago • 0 comments

Issue

I had Aider and Gemini 2.5 Pro do this. The question failed, but I was able to scrape the analysis from the Aider chat log. I will spare you the detail and include only the key portions. It was an 800K token request. Anyway, I am liking the idea of having Aider analyze its own chat log to diagnose challenges. If this is a helpful issue then great; if not, feel free to close it.

Analyze the editing failures encountered during a collaborative coding session using Aider with various Large Language Models (LLMs) and edit formats (diff, whole file). The goal is to identify the root causes of these failures, understand the challenges involved in AI-assisted code editing, and potentially inform strategies to mitigate such issues in the future. The analysis is based on the detailed chat history provided in .aider.chat.history.md. My aider chat history contained 39 editing issues.

Root Cause Categories & Counts:

LLM Context/State Mismatch: 8 instances
Redundant Edit Generation: 13 instances
LLM Search Block Generation Error (Insufficient Context): 3 instances
LLM Search Block Generation Error (Incorrect Context): 3 instances
LLM File Targeting Error: 2 instances
Unknown/Tool Issue?: 10 instances
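As a quick sanity check on the tally above, the following sketch sums the categories and prints each one's share of the total. The numbers are copied from the list; the variable names are only illustrative.

```python
# Category counts copied from the list above.
counts = {
    "LLM Context/State Mismatch": 8,
    "Redundant Edit Generation": 13,
    "LLM Search Block Generation Error (Insufficient Context)": 3,
    "LLM Search Block Generation Error (Incorrect Context)": 3,
    "LLM File Targeting Error": 2,
    "Unknown/Tool Issue?": 10,
}

total = sum(counts.values())

# Print categories from most to least frequent, with their share of the total.
for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{n:>3}  {n / total:6.1%}  {cause}")
print(f"{total:>3}  total")
```

The total comes to 39, which matches the number of editing issues reported in the chat history.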

Conclusions:

Context Management is Key: The most frequent identifiable issues trace back to the LLM generating edits based on an outdated understanding of the file's current state: 8 failures were direct context/state mismatches, and the 13 redundant edits are plausibly a symptom of the same problem. This highlights the difficulty of maintaining perfect context synchronization in a conversational coding workflow, especially when edits fail or are applied manually. Switching to whole-file editing mode helped mitigate this later in the session.
Redundant Edits: The LLM often proposed changes that had already been made, suggesting it sometimes failed to recognize the current state or re-proposed edits after a previous failure without checking if the change was now unnecessary.
Search Block Precision: Six failures stemmed from the LLM not generating a SEARCH block that exactly matched the target code, either by including too little surrounding context (3 instances) or by having minor discrepancies such as whitespace or slightly different lines (3 instances).
Unexplained Failures: Ten failures occurred even though Aider's feedback indicated the SEARCH block did match the file content. These are harder to diagnose definitively but could point to subtle, non-visible character differences (like line endings or whitespace types) or potential inconsistencies in Aider's matching/application logic, especially with the diff format used initially.
Tooling Interaction: The interplay between the LLM (generating edits), Aider (applying edits and managing context), and the user (confirming changes, potentially making manual edits) creates opportunities for mismatches.
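The failure modes described in these conclusions can be told apart mechanically. Below is a minimal sketch of such a triage step, assuming you have the file text plus the SEARCH and REPLACE text of a failed edit as strings. It is not Aider's actual matching logic; `Diagnosis`, `diagnose`, and the helper functions are hypothetical names introduced here for illustration.

```python
from enum import Enum


class Diagnosis(Enum):
    EXACT_MATCH = "search block matches the file exactly"
    ALREADY_APPLIED = "replacement text is already present (redundant edit)"
    LINE_ENDING_MISMATCH = "matches only after normalizing line endings"
    WHITESPACE_MISMATCH = "matches only after normalizing whitespace"
    NOT_FOUND = "search block not found (stale or incorrect context)"


def _normalize_newlines(text: str) -> str:
    # Treat CRLF and bare CR as LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def _normalize_whitespace(text: str) -> str:
    # Per line: drop leading/trailing whitespace and collapse runs of
    # spaces or tabs into a single space.
    lines = _normalize_newlines(text).split("\n")
    return "\n".join(" ".join(line.split()) for line in lines)


def diagnose(file_text: str, search: str, replace: str) -> Diagnosis:
    """Classify why a SEARCH block did or did not match file_text."""
    if search in file_text:
        return Diagnosis.EXACT_MATCH
    if replace and replace in file_text:
        return Diagnosis.ALREADY_APPLIED
    if _normalize_newlines(search) in _normalize_newlines(file_text):
        return Diagnosis.LINE_ENDING_MISMATCH
    if _normalize_whitespace(search) in _normalize_whitespace(file_text):
        return Diagnosis.WHITESPACE_MISMATCH
    return Diagnosis.NOT_FOUND


# Example: the edit was already made, so the SEARCH text is gone
# but the REPLACE text is present.
current = "def f():\n    return 1\n"
result = diagnose(
    current,
    search="def f():\n    return 0\n",
    replace="def f():\n    return 1\n",
)
print(result.value)
```

The checks run from strictest to loosest, so a mismatch is attributed to the narrowest cause that explains it. Substring matching is a simplification: a real tool would likely also want to confirm that the match is unique and aligned to line boundaries.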

Version and model info

My Aider chat history spans sessions from just before and just after the Aider v0.82.0 release, so some of these issues may already have been addressed in v0.82.0. I was mostly using Gemini 2.5 Pro, Gemini 2.0 Flash, Claude 3.7 Sonnet and Claude 3.5 Haiku.

Apr 26 '25 23:04 jmarkmorris