chatcraft.org
Explore using RAG in ChatCraft
As we start to consider adding the ability to attach files to a chat (see #325), we're going to run into cases where the context window of the chat is not enough to fit everything we have. Consider a PDF of a paper, or a zip file with a bunch of files and code that you want to ask questions about.
RAG (Retrieval-Augmented Generation) is a way to work with a large body of context (e.g., a big document, a database, etc.): you "retrieve" only the relevant chunks of it, then include those along with your prompt. For example, if I had a zip file of a source code project, I might only need to include 5 or 6 chunks of code with my question vs. the whole thing. RAG techniques let you find chunks of text within a larger document/database that are similar to what you are talking about, usually by comparing embedding vectors.
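To make the retrieval step concrete, here is a minimal sketch of how it could work, assuming every chunk already has an embedding vector. The names `EmbeddedChunk`, `cosineSimilarity`, `retrieveTopK`, and `buildPrompt` are hypothetical and not part of ChatCraft; a real implementation would likely use a vector index rather than a linear scan.

```typescript
// A chunk of a larger document together with its embedding vector.
// (Hypothetical shape, for illustration only.)
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: 1 means same direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as similar to nothing instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best match first.
function retrieveTopK(
  query: number[],
  chunks: EmbeddedChunk[],
  k: number
): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ chunk }) => chunk);
}

// Build the final prompt: retrieved context first, then the user's question.
function buildPrompt(question: string, context: EmbeddedChunk[]): string {
  const contextText = context.map((c) => c.text).join("\n---\n");
  return `Use the following context to answer.\n\n${contextText}\n\nQuestion: ${question}`;
}
```

The idea is that only the output of `retrieveTopK` gets sent to the model, so the prompt stays small no matter how big the attached file is.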
I was reminded of this while reading Twitter today.
It's possible to generate embeddings in a browser (free), or have OpenAI ($$$) do it for you:
- https://js.langchain.com/docs/integrations/text_embedding/tensorflow
- https://developers.google.com/mediapipe/solutions/text/text_embedder/web_js
- https://hub.superlinked.com/vector-embeddings-in-the-browser
- https://platform.openai.com/docs/guides/embeddings
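For the OpenAI option, a rough sketch of the request and response handling might look like the following. The helper names (`buildEmbeddingRequest`, `parseEmbeddingResponse`) are hypothetical, and the model name is just one assumed choice; check the embeddings guide linked above for current models and pricing.

```typescript
// Assumed model choice for illustration; other OpenAI embedding models work the same way.
const EMBEDDING_MODEL = "text-embedding-3-small";

interface EmbeddingRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the HTTP request for embedding a batch of texts (hypothetical helper).
function buildEmbeddingRequest(texts: string[], apiKey: string): EmbeddingRequest {
  return {
    url: "https://api.openai.com/v1/embeddings",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: EMBEDDING_MODEL, input: texts }),
    },
  };
}

// The part of the response body this sketch relies on.
interface EmbeddingResponse {
  data: { index: number; embedding: number[] }[];
}

// Extract the vectors, ordered by index so they line up with the input texts.
function parseEmbeddingResponse(body: EmbeddingResponse): number[][] {
  return [...body.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}
```

In a browser this could be used roughly as `const { url, init } = buildEmbeddingRequest(chunks, key); const res = await fetch(url, init); const vectors = parseEmbeddingResponse(await res.json());`, with the resulting vectors stored alongside their chunks.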
RAG probably isn't the main way we'd use ChatCraft, but given that we have a database of all chats and the ability to include files, we should probably explore whether we can leverage it for our use cases.