raptor
Inquiring about Vector DB implementation
Hi, thanks for the code.
I want to understand why no vector database is used to store the embeddings for fast retrieval, as is done in conventional RAG.
I read in the paper that the collapsed tree approach does compute cosine similarity against all nodes, and that a better approach is to use a fast k-nearest-neighbor library such as FAISS. So my questions are:
1- What were the considerations behind not integrating a vector database? Was there any benefit?
2- When you recommend k-nearest-neighbor libraries, is the intention solely to replace the existing cosine similarity search, so that the search does not have to run over all the nodes?
3- And how can I integrate the recommended library for retrieval with my answer_question method?
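To make question 2 concrete, here is a minimal sketch of what I understand the collapsed tree retrieval to be doing: cosine similarity against every node embedding, then keeping the top k. The names `cosine_similarity` and `brute_force_top_k` and the toy vectors are my own placeholders, not taken from the RAPTOR code, and it is plain Python for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(
    query: Sequence[float],
    node_embeddings: List[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Score the query against ALL nodes (hypothetical stand-in for the
    collapsed tree search) and return the k best (index, score) pairs."""
    scored = [
        (i, cosine_similarity(query, emb))
        for i, emb in enumerate(node_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "node embeddings"; real ones would come from the embedding model.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
top = brute_force_top_k([1.0, 0.1], nodes, k=2)
print(top)
```

My assumption is that a k-nearest-neighbor library such as FAISS would replace only the `brute_force_top_k` step with an index lookup, leaving the rest of the pipeline unchanged. Please correct me if that is wrong.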
Your insights on these matters would be greatly appreciated.
Thanks!