larger code file editing/viewing improvements

Open createthis opened this issue 7 months ago • 7 comments

I've been using OpenHands for a couple of months now, and I find that str_replace_editor causes a lot of problems when working with code files longer than a few hundred lines.

Here are a few suggested improvements:

  1. It would be nice if the view truncation limit were configurable at an advanced configuration level. I personally run DeepSeek-V3-0324 locally, and through ktransformers I have a fair degree of control over the context length and the max output per response. It wastes time when str_replace_editor truncates a 900-line file. I'd much prefer that it just spit out the file in its entirety. I realize not everyone wants this functionality when dealing with pay-per-million-token APIs, but it makes sense for me because I'm only paying for the cost of electricity.

  2. When writing unit tests for code of more than trivial complexity, it is common to have duplicate lines in the unit test file. This makes it extremely difficult to edit the file with str_replace_editor because the string being replaced must be unique in the file. Has anyone tried adding a line number option to str_replace_editor so that the uniqueness constraint can be bypassed when a line number is also provided? My mind keeps coming back to the diff and patch tools on Unix systems, as they appear to have solved this problem a long time ago.
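
As a rough illustration of the line-number idea in point 2, the sketch below uses plain `sed` with a line address to change one of two identical lines. This is not an existing str_replace_editor option; the file name, its contents, and the choice of line 5 are assumptions made up for the example.

```shell
set -eu

# Hypothetical test file in which the same assertion appears twice.
workdir="$(mktemp -d)"
file="$workdir/example.spec.js"
cat > "$file" << 'EOF'
it('adds', () => {
  expect(result).toBe(1);
});
it('subtracts', () => {
  expect(result).toBe(1);
});
EOF

# A plain string match is ambiguous: the target line occurs twice.
matches="$(grep -c 'expect(result).toBe(1);' "$file")"
echo "matches: $matches"

# Addressing line 5 explicitly changes only the second occurrence.
sed '5s/toBe(1)/toBe(0)/' "$file" > "$file.new" && mv "$file.new" "$file"

cat "$file"
```

The line address plays the same role as the line numbers in a unified diff hunk header: it tells the tool which of several identical lines is meant.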

Similar to https://github.com/All-Hands-AI/OpenHands/issues/6889

createthis avatar Apr 27 '25 16:04 createthis

Today I've been experimenting with the following additions to my prompts. Honestly, they're working really well. I'm impressed.

Actions

  1. You can view an entire file without truncation with cat -n /workspace/path/to/file.spec.js | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/'. The sed portion helps you see leading whitespace clearly, as cat -n normally adds a tab character after the line number, which is confusing.

  2. You can edit a file with the patch --ignore-whitespace --verbose -r - -V never CLI command, via the execute_bash tool. Call patch like this, replacing YOUR_DIFF_HERE with your diff:

    cat << 'EOF' | patch -p0 --ignore-whitespace --verbose -r - -V never /workspace/path/to/file.spec.js
    YOUR_DIFF_HERE
    EOF
    

Procedure

  1. Use cat -n /path | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/' to read files. DO NOT USE str_replace_editor. It will truncate large files.
  2. Use patch to edit files and add tests. DO NOT USE str_replace_editor. It has trouble with large files.
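
For reference, here is a small sketch of what the read command above produces. The file name and contents are hypothetical, and the pipeline assumes GNU `cat` and GNU `sed` (other `sed` implementations may not understand `\t`).

```shell
set -eu

# Hypothetical file with indented lines, standing in for a real spec file.
workdir="$(mktemp -d)"
file="$workdir/example.spec.js"
cat > "$file" << 'EOF'
describe('math', () => {
  it('adds', () => {
    expect(1 + 1).toBe(2);
  });
});
EOF

# Same pipeline as in the prompt: number the lines, then replace the tab
# that cat -n puts after each number with a visible '|' delimiter.
numbered="$(cat -n "$file" | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/')"
printf '%s\n' "$numbered"
```

With the tab replaced by `|`, everything to the right of the delimiter is the file's own content, so line 3 shows up as `     3|    expect(1 + 1).toBe(2);` and the four spaces of indentation are unambiguous.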

createthis avatar Apr 28 '25 23:04 createthis

That's very interesting, thank you for this!

I think we may need to look for a way to offer different editing tools to different LLMs. My current take on this is not very nice (it's a bit too hardcoded), but we can find a better way. We have already taken one step: we now have the ability to disable all of the agent's default tools. We may need that when people use MCP servers that provide similar functionality, or even similar function names, or when they want to use only a completely different set of tools.

Maybe we could move forward with an idea like this, then see if we can implement some alternative tools.

enyst avatar Apr 29 '25 18:04 enyst

Yes, a plugin or an MCP server would be ideal for this. I've seen the MCP discussions in the GitHub issues. I just don't know how to use it yet, as I'm running everything via the standard docker run method.

createthis avatar Apr 29 '25 18:04 createthis

FYI, I wrote a tool called diffcalculia to automatically validate and fix the majority of AI unified diff errors. It is installed via pip. The new prompt snippet I'm using is:

# File Reading and Editing

1. Run this once via bash:

   ```bash
   python3 -m venv my_venv && \
   source my_venv/bin/activate && \
   python3 -m pip install --force-reinstall git+https://github.com/createthis/diffcalculia.git
   ```

2. You can edit a file with the `patch` CLI command, via the execute_bash tool.
   Call `patch` like this, replacing YOUR_DIFF_HERE with your diff:

   ```bash
   cat << 'EOF' | diffcalculia --fix | \
     patch -p0 --ignore-whitespace --verbose -r - -V never /workspace/file_to_edit
   YOUR_DIFF_HERE
   EOF
   ```
   
   The `diffcalculia --fix` command will fix minor line count discrepancies for 
   you automatically!

3. Use `patch` to edit files and add tests. DO NOT USE str_replace_editor. It has
   trouble with large files.

4. Use `cat -n /path | sed -E 's/^([[:space:]]*[0-9]+)\t/\1|/'` to read files. DO 
   NOT USE str_replace_editor. It will truncate large files.

I'm getting very good results with this and DeepSeek-V3-0324.
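
To make "line count discrepancies" concrete: a unified diff hunk header has the form `@@ -start,count +start,count @@`, where the old count is the number of context plus removed lines and the new count is the number of context plus added lines. The sketch below only detects a mismatch; it is not how diffcalculia is implemented, and the `check_hunks` helper, the file names, and the sample diff are all hypothetical. It assumes a single-file diff.

```shell
set -eu

# Hypothetical diff whose header claims 3 old and 3 new lines,
# while the body actually has 3 old lines and 4 new lines.
workdir="$(mktemp -d)"
diff_file="$workdir/change.diff"
cat > "$diff_file" << 'EOF'
--- example.spec.js
+++ example.spec.js
@@ -1,3 +1,3 @@
 it('adds', () => {
-  expect(result).toBe(1);
+  const result = add(1, 1);
+  expect(result).toBe(2);
 });
EOF

# check_hunks (hypothetical helper): compare each hunk header's counts
# with the number of lines actually present in the hunk body.
check_hunks() {
  awk '
    function report_hunk() {
      if (in_hunk && (got_old != want_old || got_new != want_new))
        printf "hunk at line %d: header says -%d +%d, body has -%d +%d\n", hdr, want_old, want_new, got_old, got_new
    }
    /^@@ / {
      report_hunk()
      # $2 is "-start,count" and $3 is "+start,count"; count defaults to 1.
      split($2, o, ","); split($3, n, ",")
      want_old = (o[2] == "") ? 1 : o[2] + 0
      want_new = (n[2] == "") ? 1 : n[2] + 0
      got_old = 0; got_new = 0; in_hunk = 1; hdr = NR
      next
    }
    in_hunk && /^ /   { got_old++; got_new++ }  # context line
    in_hunk && /^-/   { got_old++ }             # removed line
    in_hunk && /^[+]/ { got_new++ }             # added line
    END { report_hunk() }
  ' "$1"
}

report="$(check_hunks "$diff_file")"
printf '%s\n' "$report"
```

Here the checker flags the hunk because the header should read `@@ -1,3 +1,4 @@`. This is the kind of small miscount that a model can easily make when writing a diff by hand.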

createthis avatar May 04 '25 05:05 createthis

I am in the process of encapsulating this into an MCP server. I think it's mostly done: https://github.com/createthis/diffcalculia_mcp

EDIT: Open Hands AI sees the tools! Pretty cool!

createthis avatar May 11 '25 18:05 createthis

Thanks for noting this down. Large files are a headache to deal with. Are there prompts for splitting large, hard-to-use code files into smaller files with a proper file structure? Or is this a constraint introduced by certain programming languages?

BradKML avatar May 15 '25 02:05 BradKML

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 15 '25 02:06 github-actions[bot]

This issue was closed because it has been stalled for over 30 days with no activity.

github-actions[bot] avatar Jun 23 '25 02:06 github-actions[bot]