
Document RAG

Open h3x4g0ns opened this issue 8 months ago • 2 comments

Is your feature request related to a problem? Please describe. I understand that the current approach uses heuristics such as document titles to determine which documents are relevant for context. This is obviously a naive approach and doesn't capture the nuances and subtleties involved when asking questions or doing systems design based on your existing code.

Describe the solution you'd like Let's implement document RAG. RAG needs two elements: 1) knowledge persistence and 2) knowledge retrieval.

  1. Knowledge Persistence: We need to treat each file as a document containing relevant information. For each file we store its embedding (which also serves as a document hash) and the actual content. The easiest way to do this is to use Ollama's new embedding feature and register a callback for editor activity, so that once changes have been made to a document and we detect a silent period, we feed the entire document through the embedding model and store the resulting vector. For persistence, it then becomes trivial to maintain a local store (ideally Postgres) mapping embeddings to document paths. We would probably maintain one index per workspace, and there would be a delay while the index is built from scratch when a new workspace is loaded.

  2. Knowledge Retrieval: When a completion query is issued, we embed the query and use nearest-neighbor search to determine which documents are most relevant to completing the request. Based on the input context length of the selected model, we can choose how many of these chunks to incorporate as context. More complex heuristics for ranking these chunks can come later.
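The two steps above could be sketched roughly as follows. This is a minimal illustration, not twinny code: the names (`WorkspaceIndex`, `embed`, `topK`) are hypothetical, and the `embed` function is a toy stand-in for a real call to Ollama's embedding endpoint. The in-memory `Map` stands in for the proposed Postgres-backed store.

```typescript
type Doc = { path: string; content: string; vector: number[] };

// Toy placeholder embedding: a real implementation would POST the text to
// Ollama's embedding endpoint and return the model's vector instead.
function embed(text: string): number[] {
  const v = new Array(8).fill(0);
  for (let i = 0; i < text.length; i++) v[i % 8] += text.charCodeAt(i);
  const norm = Math.hypot(...v) || 1;
  return v.map((x) => x / norm); // normalise so dot product = cosine
}

// Cosine similarity of two already-normalised vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}

class WorkspaceIndex {
  private docs = new Map<string, Doc>();

  // 1) Persistence: (re-)embed a document, e.g. after a silent period
  //    following editor changes.
  upsert(path: string, content: string): void {
    this.docs.set(path, { path, content, vector: embed(content) });
  }

  // 2) Retrieval: return the paths of the k nearest documents to the query.
  topK(query: string, k: number): string[] {
    const q = embed(query);
    return Array.from(this.docs.values())
      .map((d) => ({ path: d.path, score: cosine(q, d.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((d) => d.path);
  }
}
```

In the extension, `upsert` would be driven by a debounced text-change listener, and the number passed to `topK` would be derived from the selected model's context window.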

Describe alternatives you've considered

  • Copying and pasting code: This assumes the user is actually 100% sure of which files they need to access in order to solve the issue. Fundamentally, this introduces a lot of friction, and we want the knowledge-retrieval step to be seamless.
  • More robust RAG algorithms: It goes without saying that we could use graph-based RAG approaches that weight documents by other heuristics, but I think implementing the persistence and retrieval features first would let us build more complex retrieval algorithms on top of them.

I'm a huge fan of this extension; it's game changing! Even Copilot is quite irritating to use because its RAG just uses context based on where the cursor is. Incorporating this feature would be game changing IMHO.

h3x4g0ns avatar Jun 17 '24 18:06 h3x4g0ns