[Bug]: create_final_covariates.parquet not generated

Open win4r opened this issue 1 year ago • 4 comments

Describe the bug

create_final_covariates.parquet not generated

Steps to reproduce

No response

Expected Behavior

No response

GraphRAG Config Used

No response

Logs and screenshots

No response

Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:

win4r avatar Jul 10 '24 23:07 win4r

Hi! Please provide more information about the issue. Can you please add the config used?

AlonsoGuevara avatar Jul 11 '24 01:07 AlonsoGuevara

If you're using the default indexing pipeline, it no longer generates create_final_covariates.parquet. Set covariates to None, or leave it out, when creating a LocalSearchMixedContext instance.
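To avoid hard-coding either case, you could check whether the indexer actually wrote the file and fall back to None when it didn't. This is only a minimal sketch: `find_covariates_file` and the `output_dir` argument are hypothetical names for illustration, not part of GraphRAG.

```python
from pathlib import Path
from typing import Optional

# File name taken from this issue; the directory layout is an assumption.
COVARIATE_TABLE = "create_final_covariates"


def find_covariates_file(output_dir: str) -> Optional[Path]:
    """Return the covariates parquet path if the indexer wrote one, else None."""
    candidate = Path(output_dir) / f"{COVARIATE_TABLE}.parquet"
    return candidate if candidate.is_file() else None
```

If this returns None, pass covariates=None when building the LocalSearchMixedContext; otherwise load the file (for example with pandas `read_parquet`) the way the local search example does.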

zanderjiang avatar Jul 11 '24 02:07 zanderjiang

> If you're using the default indexing pipeline it does not generate a create_final_covariates.parquet anymore. Set the covariates to [] or don't include it when defining a LocalSearchMixedContext class.

What happens if I don't include covariates? How do I generate covariates?

win4r avatar Jul 11 '24 02:07 win4r

Covariates are claims associated with the extracted entities. I'm not entirely sure why they decided to disable the covariate file by default. My guess is that it's to reduce LLM calls, since GraphRAG is very expensive to run even on small indexing tasks.

Anyways, if you want to generate covariates, you can set GRAPHRAG_CLAIM_EXTRACTION_ENABLED to True in the .env file in your data project root. It should then generate the create_final_covariates.parquet file.
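If the file still doesn't show up, it may be worth confirming that the flag is really set in the .env file you think is being used. A rough sketch for that check follows; `claim_extraction_flag` is a hypothetical helper, and the set of values it treats as "true" is my assumption, not necessarily how GraphRAG parses the variable.

```python
from pathlib import Path

# Variable name taken from the comment above.
FLAG = "GRAPHRAG_CLAIM_EXTRACTION_ENABLED"


def claim_extraction_flag(env_path: str) -> bool:
    """Report whether the .env file sets the claim-extraction flag to a true-like value."""
    path = Path(env_path)
    if not path.is_file():
        return False
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == FLAG:
            # Assumption: "true", "1" and "yes" count as enabled.
            return value.strip().strip("'\"").lower() in {"true", "1", "yes"}
    return False
```

A commented-out line such as `# GRAPHRAG_CLAIM_EXTRACTION_ENABLED=True` is reported as not enabled, which is the case this check is meant to catch.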

zanderjiang avatar Jul 11 '24 02:07 zanderjiang

Thanks for your help, @zanderjiang. Absolutely correct.

Also, @win4r you can turn it on here:

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

Just uncomment the enabled line in your settings.yaml file. I'll resolve the issue, but please reopen it if this doesn't work.

AlonsoGuevara avatar Jul 11 '24 23:07 AlonsoGuevara

After enabling claim_extraction, I got a KeyError for covariate_type in the "create final covariates" step.

mavershang avatar Jul 30 '24 12:07 mavershang

> After enabling claim_extraction, I got a KeyError for covariate_type in the "create final covariates" step.

How did you solve this problem?

yanning169 avatar Sep 08 '24 08:09 yanning169