Add a way for GPT Pilot to work on an existing codebase
Working on new codebases is the fun part of development. To really remove the drudgery, we need GPT Pilot to be able to read an existing codebase. My thoughts on how this could work:
- Each file should be examined (maybe even using the Unix `file` utility?) to determine what technology is used (see the sketch after this list)
- Feed each file to a parser that returns a list of all the classes/functions it contains
- Each function should be fed to a Developer/CodeMonkey agent (or alternatively, a Technical Writer agent) that converts it into a description of its functionality.
- These descriptions should be compiled and fed to a Tech Lead (or Architect) agent, which will then have:
- A list of the technologies used
- A description of the general architecture
At that point, it should be possible to compile prompts that favour technologies already used.
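To make the first two steps concrete, here's a minimal sketch of such a scan, using only the standard `ast` module and so limited to Python files; the extension-to-technology map and function names are illustrative, and other languages would need their own parsers (e.g. tree-sitter):

```python
import ast
from pathlib import Path

# Illustrative mapping from file extension to "technology".
EXTENSION_TO_TECH = {".py": "Python", ".js": "JavaScript", ".html": "HTML"}

def scan_codebase(root: str) -> dict:
    """Walk a project, collecting technologies and Python symbols."""
    report = {"technologies": set(), "symbols": {}}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        tech = EXTENSION_TO_TECH.get(path.suffix)
        if tech:
            report["technologies"].add(tech)
        if path.suffix == ".py":
            try:
                tree = ast.parse(path.read_text(encoding="utf-8"))
            except SyntaxError:
                continue  # skip files that don't parse
            # List every class and function; each of these would then be
            # handed to the summarising agent described above.
            report["symbols"][str(path)] = [
                node.name
                for node in ast.walk(tree)
                if isinstance(node, (ast.ClassDef, ast.FunctionDef,
                                     ast.AsyncFunctionDef))
            ]
    return report
```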
As a bonus, it would also be possible to use GPT Pilot to work on GPT Pilot.
Relatedly, it needs to be able to diagnose a previous project. I have one with a stray call (/j in a js file) that causes Pilot to crash, and re-initializing the project just runs into the same error. I tried creating another project to diagnose the first, but it also crashed, possibly while completely ignoring the actual request in favor of building a whole new project based on the first. This could be implemented simply by asking, when a project is resumed, "Should we resume where we left off?" and dropping into the debug prompt on "no".
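For what it's worth, that resume-or-debug branch could be as small as this sketch; `ask_user`, `resume_project`, and `run_debug_session` are hypothetical stand-ins for Pilot internals, and only the branching is the actual suggestion:

```python
# Hypothetical stand-ins for GPT Pilot internals.
def ask_user(prompt: str) -> str:
    return input(f"{prompt} ")

def resume_project(name: str) -> None:
    print(f"Resuming {name} from the last saved step...")

def run_debug_session(name: str, problem: str) -> None:
    print(f"Debugging {name}: {problem}")

def on_project_open(name: str) -> None:
    # Ask before replaying the step that crashed last time.
    if ask_user("Should we resume where we left off? (y/n)").lower().startswith("y"):
        resume_project(name)
    else:
        run_debug_session(name, ask_user("Describe the problem to debug:"))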
It should also gather the latest information about package versions and compatibility, e.g. from PyPI, because OpenAI is good but not up to date. But yes, fixing or expanding existing code would really be an asset to have.
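For the PyPI part, the JSON API already exposes a package's latest release, so a lookup like the following sketch could feed current version numbers into prompts (how it plugs into Pilot is an assumption):

```python
import json
from urllib.request import urlopen

def latest_pypi_version(package: str) -> str:
    # PyPI serves metadata at /pypi/<name>/json; "info.version" is the
    # latest release, which can be injected into prompts so the model
    # does not suggest outdated pins.
    with urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        return json.load(resp)["info"]["version"]

print(latest_pypi_version("requests"))
```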
Apparently it can now continue working on code it has written itself? But what about placing another code base into the filesystem after naming the app but before describing it, then referring to the existing code base in the prompt so that analysing it becomes the first step? Then again, it needs testing; it may break on a workflow designed for new code.
I'd also like this use case. I think it requires some modified Agents and Agent communication structure, and probably an LLM with a larger context window (such as Claude) for at least one of the Agents, to process the existing codebase.
An extension of the original idea:
- Technical Writer reads the documentation and turns it into user stories
- Test Developer reads the tests and turns them into user stories
- SomeAgent (Tech Lead?) tries to match up the user stories written by the Technical Writer, the Developer, and the Test Developer, and attempts to reconcile differences by querying each agent (rough sketch after this list)
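A rough sketch of that matching step, assuming a naive title-based bucketing heuristic; a real version would likely use the LLM itself to decide which stories describe the same behaviour:

```python
from dataclasses import dataclass

@dataclass
class UserStory:
    source: str  # "tech_writer", "developer", or "test_developer"
    title: str
    body: str

def find_discrepancies(stories: list[UserStory]) -> list[str]:
    """Return story titles that not all three agents produced."""
    buckets: dict[str, set[str]] = {}
    for story in stories:
        buckets.setdefault(story.title.lower().strip(), set()).add(story.source)
    expected = {"tech_writer", "developer", "test_developer"}
    # Each mismatched bucket becomes a follow-up query to the agents.
    return [title for title, sources in buckets.items() if sources != expected]
```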
The functionality mentioned by @cocobeach would be a task for either a) DevOps, or b) the Developer/CodeMonkey.
HTH,
I was thinking about this today: having a small, low-impact model run locally to read through the code base and hand only the requested information to the full GPT could accomplish this without the very expensive token cost of uploading your entire base.
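A minimal sketch of that two-tier idea, assuming sentence-transformers as the small local model and naive per-file chunking (model choice and the `src` path are assumptions):

```python
from pathlib import Path

from sentence_transformers import SentenceTransformer, util

# Small embedding model that runs locally on CPU.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Naive chunking: one chunk per Python file under src/.
chunks = [p.read_text(encoding="utf-8") for p in Path("src").rglob("*.py")]
chunk_embeddings = model.encode(chunks, convert_to_tensor=True)

def relevant_chunks(question: str, top_k: int = 3) -> list[str]:
    """Return the code chunks most relevant to the question."""
    query = model.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(query, chunk_embeddings, top_k=top_k)[0]
    return [chunks[hit["corpus_id"]] for hit in hits]

# Only these chunks, not the whole repo, go into the remote GPT prompt.
```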
Hi guys, any update on this topic?
duplicate: https://github.com/Pythagora-io/gpt-pilot/issues/381