[Feature Request]: Suggestions for Implementing Traceable Query Features
Do you need to file an issue?
- [x] I have searched the existing issues and this feature is not already filed.
- [ ] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- [x] I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.
Is your feature request related to a problem? Please describe.
Introduction and gratitude: GraphRAG has demonstrated remarkable adaptability, robustness, and efficiency in handling multi-text queries. It has significantly improved our department's software development efficiency and received high praise during customer deliveries. As our project corpus has grown to tens of millions of words (approximately 13,900 books), GraphRAG's advantages have become even more apparent, offering many novel insights and perspectives that have greatly impressed us.
Everyone knows that innovation is extremely challenging, and we want to capture these shining moments. We hope the official team can add a response-tracing feature to GraphRAG: for each detailed response (especially when using OpenAI's o1 model for discussions), being able to trace it back sentence by sentence to the original text, chapter, or other related source content would make the output much easier to use and verify.
Describe the solution you'd like
Perhaps traceable indexes could be set up during the index construction process? Or could more provenance information be added to community summaries to enable cross-referencing within each community? I haven't thought of other ideas yet; a rough sketch of the first approach is below.
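To illustrate the first idea, here is a minimal sketch of a post-index step that joins text units back to the documents they were chunked from, so any cited text unit can be resolved to its source book or chapter. It assumes the parquet output names and columns used by recent GraphRAG versions (e.g. `create_final_text_units.parquet` with a `document_ids` list column and `create_final_documents.parquet` with `id`/`title` columns); these may differ in your version.

```python
import pandas as pd

# Assumed output file names; recent GraphRAG versions write parquet tables
# like these to the output directory, but names and columns vary by version.
TEXT_UNITS = "output/create_final_text_units.parquet"
DOCUMENTS = "output/create_final_documents.parquet"


def build_trace_index(text_units_path: str = TEXT_UNITS,
                      documents_path: str = DOCUMENTS) -> pd.DataFrame:
    """Map every text-unit (chunk) id to the title of its source document."""
    text_units = pd.read_parquet(text_units_path)
    documents = pd.read_parquet(documents_path)

    # Each text-unit row carries a list of the document ids it was cut from.
    tu = text_units[["id", "text", "document_ids"]].rename(
        columns={"id": "text_unit_id", "text": "chunk_text"})
    docs = documents[["id", "title"]].rename(
        columns={"id": "document_id", "title": "source_document"})

    trace = tu.explode("document_ids").merge(
        docs, left_on="document_ids", right_on="document_id", how="left")
    return trace[["text_unit_id", "chunk_text", "source_document"]]


if __name__ == "__main__":
    print(build_trace_index().head())
```

With such a table persisted alongside the index, any response that cites text-unit ids could be resolved back to the originating book or chapter.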
Additional context
No response
Hi @shaoqing404, I've also been on the lookout for something similar.
In the meantime (admittedly on a much smaller scale than you describe), I have used post-processing scripts to map responses back to the original sources (via entities, relationships, and text snippets) using the response objects; a simplified sketch of the idea follows. It's far from perfect, but it has been very useful for corroborating outputs for research. You can check out the scripts at this repository, in case they are helpful in any way.
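The core idea is roughly the following (a simplified sketch, not the actual scripts; it assumes the query result exposes `context_data` as a dict of pandas DataFrames keyed by table name, and that answers embed `[Data: Sources (1, 2); Entities (5)]`-style citations, both of which can vary between GraphRAG versions):

```python
import re
import pandas as pd

# Inline citations of the form "[Data: Sources (1, 2); Entities (5, +more)]".
# The exact wording depends on the GraphRAG version and prompts in use.
CITATION = re.compile(r"\[Data:\s*([^\]]+)\]")
DATASET = re.compile(r"(\w[\w ]*?)\s*\(([^)]+)\)")


def map_citations_to_context(response_text: str,
                             context_data: dict) -> pd.DataFrame:
    """Resolve the record ids cited in a response back to the rows of the
    context tables (entities, relationships, sources, ...) that backed it."""
    matched_rows = []
    for citation in CITATION.findall(response_text):
        for dataset, id_list in DATASET.findall(citation):
            key = dataset.strip().lower()            # e.g. "sources", "entities"
            table = context_data.get(key)
            if table is None or "id" not in table.columns:
                continue
            ids = [i.strip() for i in id_list.split(",") if i.strip().isdigit()]
            hits = table[table["id"].astype(str).isin(ids)].copy()
            hits["cited_as"] = dataset.strip()
            matched_rows.append(hits)
    return (pd.concat(matched_rows, ignore_index=True)
            if matched_rows else pd.DataFrame())
```

Joining the returned "sources" rows against a text-unit-to-document mapping produced at index time then gets you from a cited sentence back to the original book or chapter.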