What if I want it to work on an existing codebase and not from scratch?

Open hemangjoshi37a opened this issue 1 year ago • 13 comments

I want it to work on my existing project, which has multiple code files and nested folders, and I want multimodality with local models like Ollama and LiteLLM.

hemangjoshi37a avatar Jul 12 '24 07:07 hemangjoshi37a

Yup, that is the next step. Basically "build your own Intern". It will take some time building tools and logic, but it sure is possible.

Right now Devyan is at an early stage, where we get applications from text prompts.

theyashwanthsai avatar Jul 12 '24 07:07 theyashwanthsai

Please let me know how I can help in this direction. For example, you can suggest an edit to a file, and I will make the change and open a PR.

hemangjoshi37a avatar Jul 12 '24 09:07 hemangjoshi37a

I see.

I will let you know. I don't have even a rough idea of how to implement it myself, but I will let you know what can be done.

theyashwanthsai avatar Jul 12 '24 09:07 theyashwanthsai

OK. Can we use GraphRAG (https://github.com/microsoft/graphrag) for context generation using vector search? Or how else can we feed initial or base context into this system? If we can feed in base knowledge at the initial stage, then we can achieve this.

hemangjoshi37a avatar Jul 12 '24 09:07 hemangjoshi37a

We should also make tools that can edit a file instead of rewriting it. Editing within a file will be tricky for the agent.

theyashwanthsai avatar Jul 12 '24 09:07 theyashwanthsai

It is simple. We can use a prompt something like this:

You always respond in git diff format, with `-` and `+` at the beginning of each changed line and a few lines above and below for reference. Also give the line numbers where edits are made.

Using this, we can parse the edits from the response, apply them to the existing code, and run git commit with a generated commit message.
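
A minimal sketch of the parse-and-apply step, assuming the model replies with one hunk whose lines start with a space (context), `-` (removed), or `+` (added). The function names `parse_hunk` and `apply_hunk` are hypothetical, and the sketch locates the edit by matching the context lines rather than trusting the reported line numbers, which is an assumption on my side and not something decided in this thread.

```python
def parse_hunk(response: str) -> tuple[list[str], list[str]]:
    """Split a diff-style reply into the lines expected before and after the edit."""
    before, after = [], []
    for line in response.splitlines():
        # Simplification: skip file headers and hunk markers such as "@@ -1,2 +1,2 @@".
        if line.startswith(("--- ", "+++ ", "@@")):
            continue
        if line.startswith("-"):
            before.append(line[1:])
        elif line.startswith("+"):
            after.append(line[1:])
        elif line.startswith(" "):
            before.append(line[1:])
            after.append(line[1:])
        # Lines without a recognised prefix (for example chat text) are ignored.
    return before, after


def apply_hunk(source: str, before: list[str], after: list[str]) -> str:
    """Replace the single occurrence of `before` in `source` with `after`."""
    lines = source.splitlines()
    n = len(before)
    matches = [i for i in range(len(lines) - n + 1) if lines[i:i + n] == before]
    if len(matches) != 1:
        # Refuse ambiguous or missing matches instead of guessing.
        raise ValueError(f"expected exactly one match, found {len(matches)}")
    start = matches[0]
    lines[start:start + n] = after
    return "\n".join(lines) + "\n"


source = "def add(a, b):\n    return a - b\n\nprint(add(1, 2))\n"
reply = "@@ -1,2 +1,2 @@\n def add(a, b):\n-    return a - b\n+    return a + b\n"
before, after = parse_hunk(reply)
patched = apply_hunk(source, before, after)
```

After writing `patched` back to the file, the commit step could be a plain `git commit -am "<generated message>"` call, for example through `subprocess.run`.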

hemangjoshi37a avatar Jul 12 '24 09:07 hemangjoshi37a

I see. But the problem with that approach is that we might hit token limits if the context for one task is too large. We can definitely give this approach a try, though.

theyashwanthsai avatar Jul 12 '24 09:07 theyashwanthsai

We can do some tricks here with that approach. Works perfectly.

theyashwanthsai avatar Jul 12 '24 09:07 theyashwanthsai

If we can decode what cursor.sh is doing, then we can do this. If you know the Cursor IDE, it is an AI coding IDE based on VS Code. It is very good to use, but its one big problem is that it is closed source, and I don't know how it accesses information across different files. Maybe it uses some sophisticated RAG system specially designed for code.

hemangjoshi37a avatar Jul 12 '24 09:07 hemangjoshi37a

Yup. There's also another approach where we programmatically create code blocks (grouping snippets), and the operations can be done on these blocks. Not sure how good this idea would be in practice with respect to costs.
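
One possible way to sketch the block idea, assuming Python source files and using the standard `ast` module to group each top-level function or class into a block. The helper names `top_level_blocks` and `replace_block` are hypothetical, not part of Devyan.

```python
import ast


def top_level_blocks(source: str) -> dict[str, tuple[int, int]]:
    """Map each top-level function or class name to its (start, end) lines, 1-indexed inclusive."""
    blocks = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above node.lineno, so include them in the block.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            blocks[node.name] = (start, node.end_lineno)
    return blocks


def replace_block(source: str, name: str, new_code: str) -> str:
    """Swap one named block for new code and leave the rest of the file untouched."""
    start, end = top_level_blocks(source)[name]
    lines = source.splitlines()
    lines[start - 1:end] = new_code.splitlines()
    return "\n".join(lines) + "\n"


source = (
    "import math\n"
    "\n"
    "def area(r):\n"
    "    return 3.14 * r * r\n"
    "\n"
    "def perimeter(r):\n"
    "    return 2 * math.pi * r\n"
)
blocks = top_level_blocks(source)
updated = replace_block(source, "area", "def area(r):\n    return math.pi * r * r")
```

With this layout only the block being edited needs to go to the model, which is where the cost question comes in: the prompt gets smaller, but the model sees less of the surrounding file.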

theyashwanthsai avatar Jul 12 '24 09:07 theyashwanthsai

please review #6 for this

hemangjoshi37a avatar Jul 12 '24 11:07 hemangjoshi37a

> Yup. There's also another approach where we programmatically create code blocks (grouping snippets), and the operations can be done on these blocks. Not sure how good this idea would be in practice with respect to costs.

Use an AST to find the codebase's dependencies and present them to the LLM (as context in the prompt), then focus on the code snippets you want to update or write. That may solve the long-context token problem.
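
A rough sketch of what this could look like for a Python project, using only the standard `ast` module. The project is assumed to be available as a dict of module name to source text, and the helper names (`imported_modules`, `dependency_graph`, `signatures`, `build_context`) are hypothetical. Relative imports and non-Python files are not handled.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Collect the module names that a source file imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found


def dependency_graph(project: dict[str, str]) -> dict[str, set[str]]:
    """Keep only the import edges that point at other modules in the project."""
    return {name: imported_modules(src) & set(project) for name, src in project.items()}


def signatures(source: str) -> list[str]:
    """List top-level function signatures without their bodies."""
    out = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            out.append(f"def {node.name}({args})")
    return out


def build_context(project: dict[str, str], target: str) -> str:
    """Full source for the target module, signatures only for its dependencies."""
    parts = []
    for dep in sorted(dependency_graph(project)[target]):
        parts.append(f"# module {dep}")
        parts.extend(signatures(project[dep]))
    parts.append(f"# module {target} (full source)")
    parts.append(project[target])
    return "\n".join(parts)


project = {
    "utils": "def slugify(text):\n    return text.lower().replace(' ', '-')\n",
    "app": "import os\nfrom utils import slugify\n\ndef title_to_url(title):\n    return '/' + slugify(title)\n",
}
graph = dependency_graph(project)
context = build_context(project, "app")
```

The prompt then carries the full text of the file being edited plus one line per function from the modules it depends on, instead of the whole codebase.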

jojogh avatar Jul 13 '24 04:07 jojogh

This might help with editing code:

https://aider.chat/docs/unified-diffs.html

nacho-villanueva avatar Jan 08 '25 17:01 nacho-villanueva