larger code file editing/viewing improvements
I've been using OpenHands for a couple of months now, and I find that `str_replace_editor` causes a lot of problems when working with code files longer than a few hundred lines.
Here are a few suggested improvements:
- It would be nice if the view truncation limit were configurable at an advanced configuration level. I personally run DeepSeek-V3-0324 locally, and through `ktransformers` I have a fair degree of control over the context length and the maximum output per response. It wastes time when `str_replace_editor` truncates a 900-line file. I'd much prefer that it just spit out the file in its entirety. I realize not everyone wants this functionality when dealing with pay-per-million-token APIs, but it makes sense for me because I'm only paying for the cost of electricity.
- When writing unit tests for code of more than trivial complexity, it is common to have duplicate lines in the unit test file. This makes it extremely difficult to edit the file with `str_replace_editor` due to the uniqueness constraint. Has anyone tried adding a line number option to `str_replace_editor` so that the uniqueness constraint can be bypassed when a line number is also provided? My mind keeps coming back to the `diff` and `patch` tools in Unix systems, as they appear to have solved this problem a long time ago.
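To make the line number idea concrete, here is a minimal sketch of a line-anchored replacement. The function name `replace_at_line` and its argument order are hypothetical; nothing like it exists in `str_replace_editor` today. It only illustrates how a line number can disambiguate otherwise identical lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an OpenHands tool): replace the first occurrence of
# OLD with NEW, but only on line number LINE, and print the result to stdout.
# Usage: replace_at_line FILE LINE OLD NEW
replace_at_line() {
  local file=$1 line=$2 old=$3 new=$4
  # Pass the strings through the environment so awk does not interpret
  # backslash escapes in them.
  OLD="$old" NEW="$new" awk -v n="$line" '
    NR == n {
      i = index($0, ENVIRON["OLD"])   # literal substring match, no regex
      if (i > 0) {
        $0 = substr($0, 1, i - 1) ENVIRON["NEW"] substr($0, i + length(ENVIRON["OLD"]))
      }
    }
    { print }
  ' "$file"
}

# Demo: the same assertion appears on lines 1 and 3; only line 3 is changed.
demo=$(mktemp)
printf 'expect(a).toBe(1);\nfoo();\nexpect(a).toBe(1);\n' > "$demo"
replace_at_line "$demo" 3 'toBe(1)' 'toBe(2)'
rm -f "$demo"
```

The sketch writes to stdout instead of editing in place, so a caller can inspect the result before overwriting the file.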
Similar to https://github.com/All-Hands-AI/OpenHands/issues/6889
Today I've been experimenting with the following additions to my prompts. Honestly, they're working really well. I'm impressed.
Actions
- You can view an entire file without truncation with `cat -n /workspace/path/to/file.spec.js | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/'`. The sed portion helps you see leading whitespace clearly, as `cat -n` normally adds a tab character after the line number, which is confusing.
- You can edit a file with the `patch --ignore-whitespace --verbose -r - -V never` CLI command, via the execute_bash tool. Call `patch` like this, replacing YOUR_DIFF_HERE with your diff:

      cat << 'EOF' | patch -p0 --ignore-whitespace --verbose -r - -V never /workspace/path/to/file.spec.js
      YOUR_DIFF_HERE
      EOF
Procedure
- Use `cat -n /path | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/'` to read files. DO NOT USE str_replace_editor. It will truncate large files.
- Use `patch` to edit files and add tests. DO NOT USE str_replace_editor. It has trouble with large files.
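For anyone who wants to try the two commands by hand before putting them in a prompt, here is a small round trip on a scratch file. The `view_numbered` wrapper, the `demo.js` file name, and its contents are made up for illustration. It assumes GNU `sed` (the `\t` escape inside the regex is a GNU extension) and a `patch` binary on the PATH.

```shell
#!/usr/bin/env bash
# view_numbered FILE: print FILE with line numbers, using "|" instead of the
# tab that `cat -n` emits, so leading whitespace in the code stays visible.
view_numbered() {
  cat -n "$1" | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/'
}

# Scratch file standing in for a real source file (hypothetical content).
workdir=$(mktemp -d)
target="$workdir/demo.js"
printf 'function add(a, b) {\n  return a - b;\n}\n' > "$target"

view_numbered "$target"

# Apply a unified diff supplied on stdin via a heredoc. Because the target
# file is named explicitly, the names in the ---/+++ lines are not used to
# locate it.
cat << 'EOF' | patch -p0 --ignore-whitespace --verbose -r - -V never "$target"
--- demo.js
+++ demo.js
@@ -1,3 +1,3 @@
 function add(a, b) {
-  return a - b;
+  return a + b;
 }
EOF

view_numbered "$target"
```

The second `view_numbered` call should show `return a + b;` on line 2. The scratch directory is left in place so the result can be inspected; remove it with `rm -rf "$workdir"` when done.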
That's very interesting, thank you for this!
I think we may need a way to offer different editing tools to different LLMs. My current take on this is not very nice (it's a bit too hardcoded), but we can find a better way. We have already taken one step: we now have the ability to disable all of the agent's default tools. (We may need that when people use MCP servers that offer similar functionality or even similar function names, or, for that matter, totally different tools that they want to use exclusively.)
Maybe we could move forward with an idea like this, then see if we can implement some alternative tools.
Yes, a plugin or an MCP server would be ideal for this. I've seen the MCP discussions in the GitHub issues. I just don't know how to use it yet, as I'm running everything via the standard `docker run` method.
FYI, I wrote a tool called diffcalculia to automatically validate and fix the majority of AI unified diff errors. It is installed via pip. The new prompt snippet I'm using is:
# File Reading and Editing
1. Run this once via bash:
```bash
python3 -m venv my_venv && \
source my_venv/bin/activate && \
python3 -m pip install --force-reinstall git+https://github.com/createthis/diffcalculia.git
```
2. You can edit a file with the `patch` CLI command, via the execute_bash tool.
Call `patch` like this, replacing YOUR_DIFF_HERE with your diff:
```bash
cat << 'EOF' | diffcalculia --fix | \
patch -p0 --ignore-whitespace --verbose -r - -V never /workspace/file_to_edit
YOUR_DIFF_HERE
EOF
```
The `diffcalculia --fix` command will fix minor line count discrepancies for
you automatically!
3. Use `patch` to edit files and add tests. DO NOT USE str_replace_editor. It has
trouble with large files.
4. Use `cat -n /path | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/'` to read files. DO
NOT USE str_replace_editor. It will truncate large files.
I'm getting very good results with this and DeepSeek-V3-0324.
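For anyone wondering what a "line count discrepancy" is: each hunk header `@@ -a,b +c,d @@` declares `b` lines on the old side and `d` lines on the new side, and a diff whose header does not match its body can be rejected by `patch`. The sketch below recounts those numbers from the hunk body. It illustrates the kind of check involved and is not diffcalculia's actual implementation. `count_hunk_lines` is a hypothetical name, and the sketch handles only a single hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a single-hunk unified diff on stdin and print the
# old/new line counts implied by the hunk body.
count_hunk_lines() {
  awk '
    /^@@/        { in_hunk = 1; next }       # hunk header: start counting after it
    !in_hunk     { next }                    # skip the ---/+++ file header lines
    /^ / || /^$/ { old_n++; new_n++; next }  # context line: present on both sides
    /^-/         { old_n++; next }           # removed line: old side only
    /^\+/        { new_n++; next }           # added line: new side only
    END          { printf "old=%d new=%d\n", old_n, new_n }
  '
}

# The header claims 3 old and 3 new lines, but the body adds an extra line.
cat << 'EOF' | count_hunk_lines
--- a.txt
+++ a.txt
@@ -1,3 +1,3 @@
 alpha
-beta
+BETA
+beta2
 gamma
EOF
# prints: old=3 new=4
```

Comparing that output with the header shows the new-side count should be 4 rather than 3.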
I am in the process of encapsulating this into an MCP server. I think it's mostly done: https://github.com/createthis/diffcalculia_mcp
EDIT: Open Hands AI sees the tools! Pretty cool!
Thanks for noting this down. Large files are a headache to deal with, but are there prompts for splitting large, hard-to-use code files into smaller files with a proper file structure? Or is this a constraint introduced by certain programming languages?
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been stalled for over 30 days with no activity.