Changing Input Files? How to do it without keeping data from the previous input file?

Open · davidgross631 opened this issue 1 year ago · 1 comment

Is there an existing issue for this?

[X] I have searched the existing issues
[x] I have checked #657 to validate if my issue is covered by community support

Describe the issue

I've been running GraphRAG and experimenting with it yesterday and this morning, and I've run into a problem when loading new input files. Even after completely removing the content of "A Christmas Carol", replacing it with a new file and/or new content, encoding it, and running the pipeline again, it still generates knowledge graph data for "A Christmas Carol". What's even weirder is that the output does contain some data from the new content I put in the input file, but it is still overwhelmingly made up of content from "A Christmas Carol". I've restarted and retried multiple times with the same result. Has this ever happened to any of y'all?

The reason I have data from "A Christmas Carol" in the first place is that I've been following this guide: https://microsoft.github.io/graphrag/posts/get_started/
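One way to rule out leftover state between runs is to clear any derived data and confirm what is actually sitting in the input folder before indexing again. The snippet below is only a rough sketch of that idea, not a confirmed fix: the helper name `reset_workspace` is hypothetical, and the `./ragtest` project root and the `cache`, `output`, and `input` folder names are assumptions based on the layout used in the guide, so adjust them to match your own setup.

```python
import shutil
from pathlib import Path


def reset_workspace(root, stale_dirs=("cache", "output")):
    """Delete derived-data folders under `root` and list the remaining input files.

    The folder names are assumptions about the project layout,
    not confirmed GraphRAG behavior.
    """
    root = Path(root)

    # Remove folders that may still hold results from the previous input file.
    for name in stale_dirs:
        target = root / name
        if target.is_dir():
            shutil.rmtree(target)

    # Report what is left in the input folder so an old file cannot hide there.
    input_dir = root / "input"
    if not input_dir.is_dir():
        return []
    return sorted(p.name for p in input_dir.iterdir() if p.is_file())


# Example usage (assumed project root from the guide):
# print(reset_workspace("./ragtest"))
```

If the returned list still shows the old book file next to the new one, that alone would explain the mixed results. If only the new file is listed and the old content still appears after a fresh run, the stale data is coming from somewhere else.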

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

GraphRAG Version:
Operating System:
Python Version:
Related Issues:

Aug 09 '24 15:08 davidgross631