Error in create_final_text_units workflow when using Vector Database

Open nievespg1 opened this issue 1 year ago • 0 comments

Problem Description

At line 144 of the create_final_text_units.py workflow, the "select" verb tries to load the text_embedding column. That column does not exist in the dataframe when a vector database is configured, because the embeddings are written to the vector store instead.

Solution

Reuse the logic from the create_final_entities.py workflow, which makes the verb conditional on whether the embeddings live in the dataframe or in a vector database.

See line 27 of create_final_entities.py for an example of how to detect a vector store.
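A minimal sketch of that condition, not the actual graphrag code: it assumes the embedding config is a plain dict and that a vector store shows up under `strategy` -> `vector_store`. The helper names `uses_vector_store` and `build_select_columns`, the `skip_embedding` flag, and the base column names are hypothetical and only illustrate the idea.

```python
from typing import Any


def uses_vector_store(embed_config: dict[str, Any]) -> bool:
    """Return True when the embed config points at a vector store.

    Assumption: the store is configured under strategy -> vector_store.
    """
    strategy = embed_config.get("strategy") or {}
    return strategy.get("vector_store") is not None


def build_select_columns(
    base_columns: list[str],
    embed_config: dict[str, Any],
    skip_embedding: bool = False,
    embedding_column: str = "text_embedding",
) -> list[str]:
    """Build the column list for the "select" verb.

    The embedding column is requested only when embeddings are computed
    and kept in the dataframe, i.e. no vector store is configured.
    """
    columns = list(base_columns)
    if not skip_embedding and not uses_vector_store(embed_config):
        columns.append(embedding_column)
    return columns


# Hypothetical base columns, for illustration only.
BASE_COLUMNS = ["id", "text", "n_tokens"]

in_dataframe = build_select_columns(BASE_COLUMNS, {"strategy": {"type": "openai"}})
in_vector_store = build_select_columns(
    BASE_COLUMNS, {"strategy": {"type": "openai", "vector_store": {"type": "lancedb"}}}
)
```

With a check like this, the "select" step would never ask for text_embedding when the column was never written to the dataframe.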

Notes

  • This line causes the error below: https://github.com/microsoft/graphrag/blob/60a197fbd1421739542223fb78c86670b295f7c3/graphrag/index/workflows/v1/create_final_text_units.py#L144

    • error: datashaper.workflow.workflow ERROR Error executing verb "select" in create_final_text_units: "['text_embedding'] not in index"

  • This condition is a good way to tell whether the embeddings are in a dataframe or a vector database: https://github.com/microsoft/graphrag/blob/60a197fbd1421739542223fb78c86670b295f7c3/graphrag/index/workflows/v1/create_final_entities.py#L27-L30

nievespg1 · Jun 26 '24 18:06