
Better edit syntax for the improve command

Open ATheorell opened this issue 2 years ago • 0 comments

Feature description

gpt-engineer supports improving existing code with natural language prompts using the --improve flag. On a technical level, gpt-engineer sends the improve prompt and the code to the LLM and asks the LLM to return improvements as edit blocks in the format of the following example:

some/dir/example_1.py
<<<<<<< HEAD
    def mul(a,b)
=======
    def add(a,b):
>>>>>>> updated

This idea is largely borrowed from aider. However, the approach is rather error-prone: either the LLM must output a HEAD part of the edit block that is identical to a part of the existing code, which quite often is not the case, or some intelligent heuristics must handle the case where no exact match is found. Problems with the current implementation are reported in #721, #814 and #841.
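To illustrate the failure mode, here is a minimal sketch of applying such an edit block by exact string matching. The helper name `apply_edit_block` is hypothetical and is not taken from gpt-engineer or aider; it only shows why a single whitespace difference in the HEAD part is enough to make the edit unplaceable.

```python
def apply_edit_block(code: str, head: str, updated: str) -> str:
    """Replace the first exact occurrence of `head` in `code` with `updated`.

    Hypothetical helper for illustration only.
    """
    if head not in code:
        # No exact match: the edit cannot be placed without extra heuristics.
        raise ValueError("HEAD block not found in the code")
    return code.replace(head, updated, 1)


original = "def mul(a, b):\n    return a + b\n"

# An exact reproduction of the existing line applies cleanly.
fixed = apply_edit_block(original, "def mul(a, b):", "def add(a, b):")

# Here the HEAD part drops one space, so it no longer matches the code.
try:
    apply_edit_block(original, "def mul(a,b):", "def add(a,b):")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```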

An alternative way to make edits is to prompt the LLM to provide them in the classic diff syntax:

28 -    workspace = FileRepository(eval_ob["project_root"])
28 +    workspace = OnDiskRepository(eval_ob["project_root"])

This has the advantage that the file name plus the line number uniquely defines where to apply the edit, so edits can be applied even if the existing code is not reproduced perfectly. Line numbers are currently neither stored nor provided to the LLM. The easiest way to do this is probably to implement a method in the Code class that equips a code object with line numbers, and to update the to_chat method to provide the line numbers exactly the way they look in the diff syntax. The same line numbers should then be reused when parsing edits from the LLM back into the code.
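A rough sketch of what this could look like is given below. The function names `add_line_numbers` and `apply_numbered_edits` are hypothetical and do not reflect the actual Code class or to_chat method. The sketch also assumes that each edit line has the form `<line number> <sign><content>`, that a "+" line is inserted at the position of the original line with that number, and that the text of a "-" line is not compared against the code.

```python
import re

# Assumed edit format: "<line number> <sign><content>", e.g. "28 -    x = 1".
EDIT_RE = re.compile(r"^(\d+) ([+-])(.*)$")


def add_line_numbers(code: str) -> str:
    """Prefix every line with its 1-based line number."""
    return "\n".join(f"{n} {line}" for n, line in enumerate(code.splitlines(), 1))


def apply_numbered_edits(code: str, edits: str) -> str:
    """Apply "-" (remove) and "+" (add) edits addressed by original line numbers."""
    removed = set()
    added = {}  # line number -> list of new lines
    for raw in edits.splitlines():
        match = EDIT_RE.match(raw)
        if match is None:
            continue  # ignore anything that is not an edit line
        number, sign, content = int(match.group(1)), match.group(2), match.group(3)
        if sign == "-":
            removed.add(number)
        else:
            added.setdefault(number, []).append(content)

    result = []
    for n, line in enumerate(code.splitlines(), 1):
        result.extend(added.pop(n, []))  # additions go in at their line number
        if n not in removed:
            result.append(line)
    # Additions numbered past the end of the file are appended.
    for n in sorted(added):
        result.extend(added[n])
    return "\n".join(result)


source = "a = 1\nb = 2\nc = 3"
numbered = add_line_numbers(source)
# The "-" text is reproduced imperfectly, but the line number still locates the edit.
edits = "2 -b=2\n2 +b = 20"
patched = apply_numbered_edits(source, edits)
```

Because the "-" text is never compared against the code, a wrong line number would silently remove the wrong line. A real implementation would probably want at least a fuzzy check of the removed text as a safeguard.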

Of course, to make the LLM understand what is going on, it is also necessary to modify the corresponding preprompt.

Motivation/Application

Making the --improve workflow 10x more reliable!

ATheorell, Nov 25 '23 12:11