LLM-assisted codebase exploration for developer onboarding
Please confirm that you have searched existing issues in the repo
Yes, I have searched the existing issues
Any related issues?
No response
What is the area that this feature belongs to?
Other
Is your feature request related to a problem? Please describe.
The size of the codebase posed a real challenge for me in understanding it. The debugger is a great help for tracing through the code line by line, but it is sometimes inefficient. LLMs, on the other hand, can quickly answer context-specific questions if they are properly supplied with the codebase as context.
Describe the solution you'd like
Currently, my approach is as follows:
- Use OneFileLLM to aggregate the codebase into a text file.
- Use Page Assist as the web UI for connecting to LLM APIs.
- Use nomic-embed-text as the embedding model.
- Use DeepSeek-R1 free API provided by OpenRouter as the language model.
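For anyone who wants to see what the first steps amount to, here is a rough sketch in plain Python of aggregating a codebase into one text file and splitting it into chunks for an embedding model. This is not how OneFileLLM or Page Assist actually implement it; the function names, the suffix list, and the chunk sizes are all assumptions chosen for illustration.

```python
from pathlib import Path

# Assumption: which file extensions count as "source" is project-specific.
SOURCE_SUFFIXES = {".java", ".md"}


def aggregate_codebase(root: Path, out_file: Path, suffixes=SOURCE_SUFFIXES) -> int:
    """Concatenate matching files under root into out_file; return the file count."""
    count = 0
    with out_file.open("w", encoding="utf-8") as out:
        for path in sorted(root.rglob("*")):
            if not path.is_file() or path.suffix not in suffixes:
                continue
            # Never read the output file back into itself.
            if path.resolve() == out_file.resolve():
                continue
            # A header line lets the model tell which file a snippet came from.
            out.write(f"===== {path.relative_to(root).as_posix()} =====\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n")
            count += 1
    return count


def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows for an embedding model."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

Each chunk would then be embedded and the closest matches sent to the language model together with the question. The overlap keeps a method that straddles a chunk boundary from being cut off in both chunks.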
Describe alternatives you've considered
I have also tried a locally deployed 14B DeepSeek-R1 distilled model, but its performance was not satisfactory.
Additional context
I wonder whether some instructions for setting up such an environment can be provided in the DG as an interesting add-on to assist new developers in exploring the codebase.
Interesting idea. I certainly use AI (GitHub Copilot) to help me understand a portion of the code (e.g. a function/class/file) at a time, but usually not all at once.