neo4j-graphrag-python
Should `return_context` be enabled or not by default?
https://github.com/neo4j/neo4j-graphrag-python/blob/ae0831635d7ce39057a84afc1153e0b0dee015a9/src/neo4j_graphrag/generation/graphrag.py#L88
Part of the benefit of GraphRAG is providing better context for question answering.
By default, `GraphRAG.search()` leaves the context from the `retriever_result` out of its response. Could we default to enabling `return_context` instead?
I personally have been doing this for almost a month now:
and in all this time I have not needed to change it to false.
Old problem: How do you extract the context from the `RetrieverResultItem` (which you get when setting `return_context` to true)? Its content is a string rather than a dict, and the string is not valid JSON, so I would have to manually parse out the index and, more importantly, the text, which is inconvenient. It is also inconsistent: `retriever.get_search_results(query_text=question["question"], top_k=args.retrieve_topk)` returns a list of Record nodes instead of a list of `RetrieverResultItem`s, and those Record nodes have a dict with `index` and `text` as keys.
Edit: I didn't understand something yesterday that I do now: the content string in the `RetrieverResultItem` is the exact string that gets passed to the LLM. It has already been processed by a `result_formatter`, which you can also write yourself. The `result_formatter` takes the aforementioned Record nodes as input to create that string.
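For anyone hitting the same parsing problem, here is a rough sketch of what a custom formatter could look like. It does not import the library: `ResultItem` is a hypothetical stand-in for `RetrieverResultItem`, a plain dict stands in for the Record, and the record layout (a `node` entry holding `index` and `text`, plus a `score`) is my assumption, so adjust it to whatever your retriever actually returns.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class ResultItem:
    """Hypothetical stand-in for the library's RetrieverResultItem."""

    content: Any
    metadata: Optional[dict] = None


def format_record(record: Mapping[str, Any]) -> ResultItem:
    """Keep only the chunk text as content and move the rest to metadata.

    Assumed record layout: {"node": {"index": ..., "text": ...}, "score": ...}.
    """
    node = record["node"]
    return ResultItem(
        # The content is what would end up in the LLM prompt, so keep it clean.
        content=node["text"],
        # Structured fields stay accessible without parsing a string.
        metadata={"index": node["index"], "score": record.get("score")},
    )


# Example with a dict standing in for a Record returned by the retriever.
record = {"node": {"index": 3, "text": "Neo4j is a graph database."}, "score": 0.91}
item = format_record(record)
print(item.content)
print(item.metadata["index"])
```

With the real library, a function of this shape would be passed as the retriever's `result_formatter`, so the index and text come back as separate fields instead of one string.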
Oh, I mixed up the dates; it is already 2025. I thought this was new because I saw that the `return_context` default is being changed to true, as per the in-code deprecation warning.
closed by #191