
LLM-assisted codebase exploration for developer onboarding

Open yyccbb opened this issue 10 months ago • 1 comments

Please confirm that you have searched existing issues in the repo

Yes, I have searched the existing issues

Any related issues?

No response

What is the area that this feature belongs to?

Other

Is your feature request related to a problem? Please describe.

The huge codebase posed a real challenge for me to understand. The debugger is a great help for tracing through the code line by line, but it is often too slow for getting an overall picture. LLMs, on the other hand, can quickly answer context-specific questions if properly fine-tuned for the codebase.

Describe the solution you'd like

Currently, my approach is as follows:

  1. Use OneFileLLM to aggregate the codebase into a text file.
  2. Use Page Assist as the web UI for connecting to LLM APIs.
  3. Use nomic-embed-text as the embedding model.
  4. Use DeepSeek-R1 free API provided by OpenRouter as the language model.
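
To make step 1 more concrete, here is a minimal Python sketch of what "aggregate the codebase into a text file" could look like, followed by splitting the result into overlapping chunks that an embedding model could index. This is not OneFileLLM's actual implementation or API. The function names (`aggregate_codebase`, `chunk_text`), the file extensions, the skipped directories, and the chunk sizes are all assumptions chosen for illustration.

```python
from pathlib import Path


def aggregate_codebase(root, extensions=(".js", ".ts", ".md"),
                       skip_dirs=("node_modules", ".git")):
    """Concatenate matching files under `root` into one string.

    Each file is preceded by a header line with its relative path so the
    model can tell where a snippet came from. The extensions and skipped
    directories are illustrative defaults, not project requirements.
    """
    root = Path(root)
    parts = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        rel = path.relative_to(root)
        # Skip anything inside an excluded directory such as node_modules.
        if any(part in skip_dirs for part in rel.parts):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        parts.append(f"=== {rel.as_posix()} ===\n{text}\n")
    return "\n".join(parts)


def chunk_text(text, size=2000, overlap=200):
    """Split text into overlapping character windows for embedding."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

For example, `Path("codebase.txt").write_text(aggregate_codebase("."), encoding="utf-8")` would produce a single file to load into the web UI, and `chunk_text(...)` shows roughly the kind of splitting that happens before embedding.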

Describe alternatives you've considered

I have also tried a locally deployed 14B distilled DeepSeek-R1 model, but its performance was not satisfactory.

Additional context

I wonder whether instructions for setting up such an environment could be added to the Developer Guide (DG) as an optional add-on to help new developers explore the codebase.

yyccbb avatar Feb 17 '25 06:02 yyccbb

Interesting idea. I certainly use AI (GitHub Copilot) to help me understand a portion of code (e.g. a function/class/file) at a time, but usually not the whole codebase at once.

tlylt avatar Mar 11 '25 14:03 tlylt