starpilot
Enhance document content intelligently
It might be valuable to call an LLM at read time to enhance the document data. Two operations might be useful:
- At document creation, before vector store embedding, an LLM could be tasked with categorising the content against user-defined tags, e.g. whether the repo relates to data science, or whether it is likely a hackathon project rather than a module intended for use in other projects.
- Use a form of Hypothetical Document Embedding at ETL time, e.g. supply the repo's readme.md content to an LLM with the prompt "Based on this readme.md, what is the intended use case for this repository, and what problems does it solve or tasks does it help with?"
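
The two operations above could be sketched roughly as follows. This is only an illustration, not starpilot's actual code: `RepoDocument`, `categorise`, `enrich`, and the `LLM` callable type are hypothetical names, and the LLM is passed in as a plain function so any client could be plugged in. The tag prompt wording is also an assumption; the use-case prompt is the one quoted in the second bullet.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]

# Assumed wording for the tagging prompt (not from the original proposal).
TAG_PROMPT = (
    "Choose the tags that describe this repository from the list: {tags}.\n"
    "Reply with a comma-separated list of tags only.\n\n{readme}"
)

# Prompt taken from the proposal above.
USE_CASE_PROMPT = (
    "Based on this readme.md, what is the intended use case for this "
    "repository, and what problems does it solve or tasks does it help with?"
    "\n\n{readme}"
)


@dataclass
class RepoDocument:
    """Hypothetical container for the text to embed plus its metadata."""
    content: str
    metadata: dict = field(default_factory=dict)


def categorise(readme: str, allowed_tags: list[str], llm: LLM) -> list[str]:
    """Ask the LLM for tags and keep only those the user defined."""
    reply = llm(TAG_PROMPT.format(tags=", ".join(allowed_tags), readme=readme))
    candidates = {part.strip().lower() for part in reply.split(",")}
    return [tag for tag in allowed_tags if tag.lower() in candidates]


def enrich(readme: str, allowed_tags: list[str], llm: LLM) -> RepoDocument:
    """Run both enrichment steps before the document is embedded."""
    tags = categorise(readme, allowed_tags, llm)
    use_case = llm(USE_CASE_PROMPT.format(readme=readme)).strip()
    # Prepend the generated use-case text so it is embedded with the README.
    return RepoDocument(
        content=use_case + "\n\n" + readme,
        metadata={"tags": tags, "use_case": use_case},
    )
```

Filtering the reply against `allowed_tags` means a tag the LLM invents never reaches the metadata. Whether the generated use-case text should be embedded together with the README, as here, or stored as a separate vector is an open design choice.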