gpt-engineer
gpt-engineer copied to clipboard
Improve existing codes
As requested in #79, #95 and #131 I did a proof of concept on how we could add new functionalities or fix errors from a set of existing code.
To use the implementation, we need to do the following setup:
- In a file called
file_list.txt
in the root of your project, add a list of code file paths, one in each line. These files should be enough to give the AI agent context to the modification you want to do, but it does not need to be complete standalone execution. - Using the
prompt
file we add a description of the modification we need to be done. It could be adding a docstring, adding new functions, classes, or even asking to correct a bug. - Call the modification process by using
gpt_engineer projects/test --steps improve_code
In this stage of implementation, we are sending all code text in a single message which is not ideal, so it only works on code small enough to be inside a prompt message, but it is a start.
Here is a small example:
This PR is already a working solution. In a future modification, I plan to separate the codes into different messages to see the results.
What do you guys think?
Hey @leomariga great initiative!
Question:
How do you see this being used?
Basically, how does one set up gpt-engineer for a new project?
I'm thinking two options:
- create new project with prompt file + memory folder, where
workspace
symlinks to the existing codebase - gpt-engineer creates a
.gpteng
folder that contains prompt andmemory
WDYT? Maybe there is a third?
Also general feedback:
- Python style convention is to use snake case
- I think requiring a file_paths.txt is too complex for the user :)
Created a discussion in discord around this
Thank you for your feedback and review =)
I agree with you that manually writing file_paths.txt
is not the best user experience. I didn't put much thought into how would be the input method. I also liked the idea of using the .gpteng
folder.
(Brainstorming) We could do something like this:
- At the root of the existing code repo we call something like
gpt-engineer -improve
which will create the.gpteng
folder. - We write in the terminal what we want to modify. This string will be stored in the
prompt
file in.gpteng
folder. - The tree of files of the project is displayed in the terminal and we select it with numbers (or we could even open the file explorer for the user!)
- The gpt-engineer does its job and automatically replaces the file in the end.
I saw the #contributors channel in the discord, but I don't have access to it yet. Let's wait to see the ideas there. =)
Hey, thanks for all the feedback and ideas. I'm working on your suggestions.
I was thinking we could change the file chat_to_files.py
to something like utils.py
or tools.py
and use it to put utility functions, like reading/parsing/writing files or strings.
The methods I put there do not make sense with the file name. I think I should change the location of these methods or rename the file. What do you guys think?
I'm open to ideas.
I think what you say makes sense!
I really prefer specific file names though so it’s easy to find things. Could have chat_to_files and files_to_chat as separate files?
very happy you’re working on this, it’s the top requested feature by far.
As for which files to add, I think we should keep it simple at first and just:
- ask if all files in selected path argument should be included
- if not, let user use arrow keys and enter to toggle which files/folders should be included (I believe there must be a cli arrow select dependency we could use)
Or we just do this: We run the tree command, and then we open the output in a text editor, and then the user can delete rows form it and everything that is not deleted when editor is closed is used.
What do you think?
There you go.
- Now we only need to call
gpt-engineer --improve
in the root folder of an existing project. A.gpteng
folder will be created by gpt-engineer. - I did some experiments in opening a selection window with python's tkinter instead of selecting the files with the terminal. I think it is much easier to use a gui, but we could implement both if someone insists on using the terminal.
I send a video of this feature working on the discord page.
I've started on a version of this. My approach is to start with the entry point(s), and iteratively include dependencies within the project. Currently creates .filename.meta files for each analyzed file; not unlike .h library files you would have in C describing the contents of the associated .c file.
I've started on a version of this. My approach is to start with the entry point(s), and iteratively include dependencies within the project. Currently creates .filename.meta files for each analyzed file; not unlike .h library files you would have in C describing the contents of the associated .c file.
Nice. The file selection on this solution is the function ask_for_files()
. We can add more input methods there if we want =)
Left some comments!
Also adding some thought for future work:
would be great to automatically print the diffs from the generation and ask the human if they like the diffs or not
Quick progress update. Now we can choose how to select the files as follows:
Still working on other stuff.
A note for you, the naming convention for the methods is incorrect.
If that's important enough, maybe you'd want to fix it.
I am also working on this feature. I have sent you a pull request, @leomariga. @AntonOsika, do you think it would be a good idea to create a branch for this?
I think creating a CLI that you can interact with while gpt-4 is editing the code to be able to iterate indefinitely and tell it to delete, add or change anything. This could make the process of developing software really efficient. Something like a ChatGPT but with code. Like the code interpreter of ChatGPT Plus but better :)
But to achieve this it would be essential to get it to always run the code correctly, without problems of paths, setting up files or typing extra commands.
Tried out PR agent:
PR Analysis
- 🎯 Main theme: Adding functionality to improve existing code
- 🔍 Description and title: Yes
- 📌 Type of PR: Enhancement
- 🧪 Relevant tests added: No
- ✨ Focused PR: Yes, the PR is focused on adding a new feature to improve existing code. All changes are related to this feature.
- 🔒 Security concerns: No, the PR does not introduce any obvious security concerns. However, it's always a good practice to handle file operations carefully to prevent any potential security issues.
PR Feedback
-
💡 General PR suggestions: The PR is well-structured and the code changes are well-documented. However, it lacks tests to ensure the new functionality works as expected. It would be beneficial to add unit tests for the new functions and integration tests to ensure the new feature works with the existing code.
-
🤖 Code suggestions:
-
relevant file: gpt_engineer/chat_to_files.py suggestion content: Consider handling exceptions when opening and writing to files. This can prevent the program from crashing if there are issues with file permissions or if the file does not exist. [important]
-
relevant file: gpt_engineer/file_selector.py suggestion content: The
ask_for_files
function could be refactored to reduce its complexity. Consider breaking it down into smaller, more manageable functions. This would improve readability and maintainability. [medium] -
relevant file: gpt_engineer/main.py suggestion content: Consider adding a validation for the
improve_option
argument. If it is not a boolean, the program may behave unexpectedly. [medium] -
relevant file: gpt_engineer/steps.py suggestion content: The
improve_existing_code
function is quite long and does a lot of things. Consider breaking it down into smaller functions to improve readability and maintainability. [medium]
-
Great to hear, screenshot looks good!
Hey @leomariga – any news on this?
Done. I added some consideration @AntonOsika
I just want to point out that @lectair did interesting modifications in a PR to my branch, but I didn't get a response from him after a review I needed to merge into my branch.
I think we can open a new PR with his changes in the official repo when this PR is merged.
Merged with commits from @lectair =)
Hey Leo great to see this ready to be merged.
Last step, which will take some time, is to merge / rebase in main and resolve conflicts so we can merge.
While doing it I have some quick final improvements:
- Stop using a directory called workspace when we run the -i command: It should always be run from the directory that one is currently in. Currently, a folder called workspace is created.
- See my comments in the PR
Also, small optional improvements:
- Make the "user file_txt" option for selecting files first and default
- Number the file selection alternatives from 1 (not 0)
- Consider not listing any
venv
ornode_modules
folders (as they have so many files)
One more thing:
We can simplify the "improve code" prompt, and use another final system message focused on explicitly requesting the output to be in the format we want, just like the gen_code
step does (it uses the use_qa
prompt).
I think we should do it.
When I tried it, it didn't give the right format (see screenshot)
Marking this as stale.
Would be great to pick it up, hopefully we don't have to close it.