
[Bug]: Updating index twice will result in a ValueError

Open masies opened this issue 9 months ago • 1 comment

Do you need to file an issue?

  • [x] I have searched the existing issues and this bug is not already filed.
  • [x] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • [x] I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

Running update_output twice throws ValueError: Could not find update_output/YYYYMMDD-HHMMSS/delta/communities.parquet in storage!

this issue is related to here

In my case, key = self._keyname(key) evaluates to 'output\update_output\YYYYMMDD-HHMMSS\delta\communities.parquet'. I guess this is caused by logic left over from the previous version.
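
To make the mismatch concrete, here is a minimal sketch that compares the key named in the ValueError with the key reported above. It is not GraphRAG's actual code: the Windows-style join of an "output" prefix is only an assumption about how such a key could arise, and to_blob_name is a hypothetical helper introduced for illustration.

```python
from pathlib import PureWindowsPath

# Key named in the ValueError: forward slashes, no "output" prefix.
expected_key = "update_output/YYYYMMDD-HHMMSS/delta/communities.parquet"

# Assumed reconstruction of the reported key: a Windows-style join of an
# "output" prefix with the expected key (not taken from GraphRAG's source).
reported_key = str(PureWindowsPath("output") / expected_key)


def to_blob_name(key: str) -> str:
    """Hypothetical helper: render a Windows-style path with forward slashes."""
    return PureWindowsPath(key).as_posix()


# The two keys differ, so a lookup by the reported key cannot find the file.
print(reported_key == expected_key)

# Normalizing the separators alone is not enough: the extra "output"
# prefix is still present in the result.
print(to_blob_name(reported_key))
```

Under these assumptions the reported key differs from the stored name in two ways: the backslash separators and the leading "output" segment. Either one alone would be enough for the lookup to fail.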

Steps to reproduce

  1. Initialize the GraphRAG root directory.
  2. Create the index.
  3. Add a file and update the index.
  4. Add another file and update the index again.

Expected Behavior

The second update should complete normally, and the index should contain both documents added in steps 3 and 4.

GraphRAG Config Used

...
update_index_output:
    type: blob 
    provider: azure
    storage_account_blob_url: ${BLOB_STORAGE_URL}
    container_name: ${ROOT_DIR}
    base_dir: "update_output"
...

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 2.1.0
  • Operating System: Windows 11
  • Python Version: 3.14
  • Related Issues:

masies avatar Mar 18 '25 21:03 masies

11

xiao-hf avatar Mar 25 '25 00:03 xiao-hf