GraphRAG - vector search
Thanks for adding GraphRAG to RAGbuilder.
I had some questions and suggestions, perhaps you want to chat some time.
- QQ: in graphrag.full_retriever you fetch the vector store data but don't use it in the method or the returns, looks redundant?
def full_retriever(question: str):
    graph_data = graph_retriever(question)
    vector_data = [el.page_content for el in vector_retriever.invoke(question)]
    final_data = f'''Graph data:
{graph_data}
    '''
    return final_data
- You don't make use of the built-in Neo4j vector search, only the fulltext index. With vector search you can allow in-graph vector and hybrid search (you can create vector indexes for chunks in the lexical graph, for entities in the domain graph, and for communities in the topical structures).
- right now the graph retriever only uses the direct neighbourhood of the nodes, this could be a good hyperparameter to add
- e.g. we have a number of different retrievers in the llm-graph-builder, see: https://github.com/neo4j-labs/llm-graph-builder/blob/DEV/backend/src/shared/constants.py
- I saw you copied some code from the neo4j-langchain integrations. Was there a reason (i.e. did you make modifications)? If so, it might be good to discuss contributing them back upstream.
- there is the option to run clustering algorithms to generate cross-document topic summaries across the entity graphs (like in the MSFT GraphRAG paper), see https://neo4j.com/developer-blog/global-graphrag-neo4j-langchain/ (we've also implemented that in https://llm-graph-builder.neo4jlabs.com if you have a graph data science enabled database).
We have documented more GraphRAG patterns here, in case you want to contribute your RAG patterns to the catalogue or provide some feedback:
- https://neo4j.com/developer-blog/graphrag-field-guide-rag-patterns/
- https://graphr.rag
Hi @jexp, thanks for your questions & thoughts!
@ashwinzyx - perhaps, you can take a look once you're back.
Hi @jexp, thanks for looking at our repo. Apologies for the delay. Just got back from vacation.
- QQ: in graphrag.full_retriever you fetch the vector store data but don't use it in the method or the returns, looks redundant?
def full_retriever(question: str):
    graph_data = graph_retriever(question)
    vector_data = [el.page_content for el in vector_retriever.invoke(question)]
    final_data = f'''Graph data:
{graph_data}
    '''
    return final_data
[Ans] Yes. It looks like we are not using vector_data for the Graph RAG, only for the Hybrid RAG. Will remove it.
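A minimal sketch of the cleanup discussed here, assuming `graph_retriever` returns a string and the vector lookup returns objects with a `page_content` attribute, as in the snippet above. The `make_full_retriever` factory, the `include_vector` flag, and the `Doc` class are hypothetical names introduced only for illustration; they are not part of RAGBuilder or LangChain. The point is that the vector store is queried only when its output actually ends up in the returned context.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    # Hypothetical stand-in for a retrieved document; only page_content is used.
    page_content: str


def make_full_retriever(
    graph_retriever: Callable[[str], str],
    vector_search: Callable[[str], List[Doc]],
    include_vector: bool = False,
) -> Callable[[str], str]:
    """Build a retriever that skips the vector lookup unless its result is used."""

    def full_retriever(question: str) -> str:
        final_data = f"Graph data:\n{graph_retriever(question)}\n"
        if include_vector:
            # Hybrid mode: append the chunk texts to the graph context.
            chunks = [el.page_content for el in vector_search(question)]
            final_data += "Vector data:\n" + "\n".join(chunks) + "\n"
        return final_data

    return full_retriever
```

With `include_vector=False` the function matches the graph-only behaviour without the unused call; with `include_vector=True` it covers the hybrid case.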
- You don't make use of the built-in Neo4j vector search, only the fulltext index. With vector search you can allow in-graph vector and hybrid search (you can create vector indexes for chunks in the lexical graph, for entities in the domain graph, and for communities in the topical structures).
[Ans] We have been using Chroma for vector search in the templates. I do see hybrid search options in the examples below: https://python.langchain.com/docs/integrations/vectorstores/neo4jvector/ https://neo4j.com/labs/genai-ecosystem/langchain/
I believe the post below uses in-graph vector search. Am I right? Is there a full example you can share for in-graph vector search? https://neo4j.com/developer-blog/global-graphrag-neo4j-langchain/
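For orientation, in-graph vector search could be sketched roughly as below. This is an illustrative sketch, not the code from the linked post: the helper names, the index name, the `Chunk` label, and the `embedding` and `text` properties are assumptions, and the `CREATE VECTOR INDEX` form assumes a recent Neo4j 5.x release. The helpers only build Cypher and parameters; nothing is sent to a database.

```python
from typing import Any, Dict, List, Tuple


def create_vector_index_cypher(
    index_name: str,
    label: str,
    prop: str,
    dimensions: int,
    similarity: str = "cosine",
) -> str:
    """Return a CREATE VECTOR INDEX statement for nodes with the given label."""
    if similarity not in ("cosine", "euclidean"):
        raise ValueError("similarity must be 'cosine' or 'euclidean'")
    return (
        f"CREATE VECTOR INDEX {index_name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {int(dimensions)}, "
        f"`vector.similarity_function`: '{similarity}'"
        "}}"
    )


def vector_search_query(
    index_name: str, embedding: List[float], k: int = 5
) -> Tuple[str, Dict[str, Any]]:
    """Return Cypher and parameters for a k-nearest-neighbour lookup."""
    cypher = (
        "CALL db.index.vector.queryNodes($index, $k, $embedding) "
        "YIELD node, score "
        # node.text is an assumed property name for the chunk text.
        "RETURN node.text AS text, score ORDER BY score DESC"
    )
    return cypher, {"index": index_name, "k": k, "embedding": embedding}
```

The resulting query and parameters could then be passed to the Neo4j Python driver, for example with `session.run(cypher, params)`. The same pattern would apply to entity or community nodes by changing the label and index name.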
- right now the graph retriever only uses the direct neighbourhood of the nodes, this could be a good hyperparameter to add
[Ans] For now we have added GraphRAG as a template. We will include these as individual components and add a hyperparameter tuning option.
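One way the neighbourhood size could be exposed as a hyperparameter is sketched below, assuming the retriever starts from a fulltext index lookup and then expands outwards. The function name, the `max_hops` and `limit` parameters, the hop cap, and the `id` property are hypothetical choices for illustration, not RAGBuilder's current retriever. The hop count is validated and interpolated into the query text rather than passed as a query parameter.

```python
def neighbourhood_query(max_hops: int = 1, limit: int = 50) -> str:
    """Build a Cypher query that expands up to max_hops around matched entities."""
    # Validate the values because they are interpolated into the query text.
    if not 1 <= int(max_hops) <= 5:
        raise ValueError("max_hops must be between 1 and 5")
    if int(limit) < 1:
        raise ValueError("limit must be positive")
    return (
        "CALL db.index.fulltext.queryNodes($index, $query, {limit: 2}) "
        "YIELD node, score "
        f"MATCH path = (node)-[*1..{int(max_hops)}]-(neighbor) "
        # node.id / neighbor.id are assumed property names.
        "RETURN node.id AS source, neighbor.id AS target, length(path) AS hops "
        f"LIMIT {int(limit)}"
    )
```

With `max_hops=1` this corresponds to the direct neighbourhood; larger values trade more context for larger and noisier results, which is why it is a natural candidate for tuning.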
- e.g. we have a number of different retrievers in the llm-graph-builder, see: https://github.com/neo4j-labs/llm-graph-builder/blob/DEV/backend/src/shared/constants.py
[Ans] Thanks for the pointer. Will take a look
- I saw you copied some code from the neo4j-langchain integrations. Was there a reason (i.e. did you make modifications)? If so, it might be good to discuss contributing them back upstream.
[Ans] No. We did not make any modifications.
- there is the option to run clustering algorithms to generate cross-document topic summaries across the entity graphs (like in the MSFT GraphRAG paper), see https://neo4j.com/developer-blog/global-graphrag-neo4j-langchain/ (we've also implemented that in https://llm-graph-builder.neo4jlabs.com/ if you have a graph data science enabled database).
[Ans] Thanks. Will take a look.
Thanks for all your feedback. It would be great to chat sometime. We want to improve the GraphRAG option in RAGBuilder and would love your contributions as well.
@jexp - can you pls review @ashwinzyx's comments? Do you have any further thoughts or suggestions? Please feel free to suggest changes or raise a PR to make the Graph RAG part of RAGBuilder even better.
@aravind10x it would probably be good to have a chat with me and @tomasonjo at some point, it's harder to go through these in GH issues :)