graphrag Helloworld walkthrough failing on entity graph creation

When running create_base_extracted_entities, the entity extraction seems to work fine (judging by the cache), but when the merge_graphs stage runs, it fails silently.

Other than a bunch of "error invoking LLM" from having too many threads, these are the only interesting log file entries:

{"type": "error", "data": "Error Invoking LLM", "stack": "Traceback (most recent call last):\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\base_llm.py\", line 57, in _invoke\n    output = await self._execute_llm(input, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\openai\\openai_chat_llm.py\", line 55, in _execute_llm\n    completion = await self.client.chat.completions.create(\n                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\resources\\chat\\completions.py\", line 1289, in create\n    return await self._post(\n           ^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\_base_client.py\", line 1805, in post\n    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\_base_client.py\", line 1503, in request\n    return await self._request(\n           ^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\_base_client.py\", line 1599, in _request\n    raise self._make_status_error_from_response(err.response) from None\nopenai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 48 seconds. 
Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}\n", "source": "Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 48 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}", "details": {"input": "\n-Goal-\nGiven a text document that is potentially relevant to this activity and a list of entity types, identify all entities of those types from the text and all relationships among the identified entities.\n\n-Steps-\n1. Identify all entities. For each identified entity, extract the following information:\n- entity_name: Name of the entity, capitalized\n- entity_type: One of the following types: [person, location, organization, document, event, relationship, object]\n- entity_description: Comprehensive description of the entity's attributes and activities\nFormat each entity as (\"entity\"<|><entity_name><|><entity_type><|><entity_description>\n\n2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.\nFor each pair of related entities, extract the following information:\n- source_entity: name of the source entity, as identified in step 1\n- target_entity: name of the target entity, as identified in step 1\n- relationship_description: explanation as to why you think the source entity and the target entity are related to each other\n- relationship_strength: an integer score between 1 to 10, indicating strength of the relationship between the source entity and target entity\n\nFormat each relationship as (\"relationship\"<|><source_entity><|><target_entity><|><relationship_description><|><relationship_strength>)\n\n3. 
Return output in English as a single list of all the entities and relationships identified in steps 1 and 2. Use **##** as the list delimiter.\n\n4. When finished, output <|COMPLETE|>\n\n-Examples-\n######################\n\nExample 1:\n\nentity_types: [person, location, organization, document, event, relationship, object]\ntext:\n were sonorous and melancholy. Occasionally they were fantastic and cheerful. Clearly they reflected the thoughts which possessed him, but whether the music aided those thoughts, or whether the playing was simply the result of a whim or fancy was more than I could determine. I might have rebelled against these ex- asperating solos had it not been that he usually terminated them by playing in quick succession a whole series of my favourite airs as a slight com- pensation for the trial upon my patience. \nDuring the first week or so we had no callers, and I had begun to think that my companion was as friendless a man as I was myself. Presently, however, I found that he had many acquaintances, and those in the most different classes of society. There was one little sallow rat-faced, dark-eyed fel- low who was introduced to me as Mr. Lestrade, and who came three or four times in a single week. One morning a young girl called\n------------------------\noutput:\n**Entities:**\n\n(\"entity\"{\"tuple_delimiter\"}\"Mr. Lestrade\"{\"tuple_delimiter\"}\"person\"{\"tuple_delimiter\"}\"A little sallow rat-faced, dark-eyed fellow who visited three or four times in a single week.\")\n\n\n**Relationships:**\n\n(\"relationship\"{\"tuple_delimiter\"}\"Mr. Lestrade\"{\"tuple_delimiter\"}\"companion\"{\"tuple_delimiter\"}\"Mr. Lestrade is an acquaintance of the narrator's companion.\"{\"tuple_delimiter\"}6)\n\n<|COMPLETE|>\n#############################\n\n\nExample 2:\n\nentity_types: [person, location, organization, document, event, relationship, object]\ntext:\n and spread out the documents upon his knees. 
Then he lit his pipe and sat for some time smoking and turning them over. \n\"You never heard me talk of Victor Trevor?\" he asked. \"He was the only friend I made during the two years I was at college. I was never a very socia- ble fellow, Watson, always rather fond of moping in my rooms and working out my own little meth- ods of thought, so that I never mixed much with the men of my year. Bar fencing and boxing I had few athletic tastes, and then my line of study was quite distinct from that of the other fellows, so that we had no points of contact at all. Trevor was the \n\n\nonly man I knew, and that only through the acci- dent of his bull terrier freezing on to my ankle one morning as I went down to chapel. \n\"It was a prosaic way of forming a friendship, but it was effective. I was laid by the heels\n------------------------\noutput:\n**Entities:**\n\n(\"entity\"{\"tuple_delimiter\"}\"Victor Trevor\"{\"tuple_delimiter\"}\"person\"{\"tuple_delimiter\"}\"Victor Trevor is the only friend the narrator made during his two years at college. Their friendship began when Trevor's bull terrier bit the narrator's ankle.\"}\n\n(\"entity\"{\"tuple_delimiter\"}\"Watson\"{\"tuple_delimiter\"}\"person\"{\"tuple_delimiter\"}\"Watson is the person being spoken to by the narrator. 
He is a companion and confidant of the narrator.\"}\n\n(\"entity\"{\"tuple_delimiter\"}\"college\"{\"tuple_delimiter\"}\"location\"{\"tuple_delimiter\"}\"The college is the place where the narrator spent two years and met Victor Trevor.\"}\n\n(\"entity\"{\"tuple_delimiter\"}\"chapel\"{\"tuple_delimiter\"}\"location\"{\"tuple_delimiter\"}\"The chapel is the location where the narrator was heading when he encountered Victor Trevor's bull terrier.\"}\n\n(\"entity\"{\"tuple_delimiter\"}\"bull terrier\"{\"tuple_delimiter\"}\"object\"{\"tuple_delimiter\"}\"The bull terrier is the dog belonging to Victor Trevor that bit the narrator's ankle, leading to their friendship.\"}\n\n**Relationships:**\n\n(\"relationship\"{\"tuple_delimiter\"}\"Victor Trevor\"{\"tuple_delimiter\"}\"bull terrier\"{\"tuple_delimiter\"}\"The bull terrier belongs to Victor Trevor and was the catalyst for the friendship between Victor Trevor and the narrator.\"{\"tuple_delimiter\"}8)\n\n(\"relationship\"{\"tuple_delimiter\"}\"Victor Trevor\"{\"tuple_delimiter\"}\"college\"{\"tuple_delimiter\"}\"Victor Trevor attended the same college as the narrator, where they became friends.\"{\"tuple_delimiter\"}7)\n\n(\"relationship\"{\"tuple_delimiter\"}\"narrator\"{\"tuple_delimiter\"}\"college\"{\"tuple_delimiter\"}\"The narrator spent two years at the college, where he met Victor Trevor.\"{\"tuple_delimiter\"}7)\n\n(\"relationship\"{\"tuple_delimiter\"}\"narrator\"{\"tuple_delimiter\"}\"Victor Trevor\"{\"tuple_delimiter\"}\"Victor Trevor is the only friend the narrator made during his two years at college.\"{\"tuple_delimiter\"}9)\n\n(\"relationship\"{\"tuple_delimiter\"}\"narrator\"{\"tuple_delimiter\"}\"chapel\"{\"tuple_delimiter\"}\"The narrator was heading to the chapel when he encountered Victor Trevor's bull terrier.\"{\"tuple_delimiter\"}6)\n\n(\"relationship\"{\"tuple_delimiter\"}\"narrator\"{\"tuple_delimiter\"}\"Watson\"{\"tuple_delimiter\"}\"Watson is the person being spoken to by the narrator, 
indicating a close relationship.\"{\"tuple_delimiter\"}8)\n\n**<|COMPLETE|>**\n#############################\n\n\n\n-Real Data-\n######################\nentity_types: [person, location, organization, document, event, relationship, object]\ntext: - lish Sir Robert in a fair position in life. Both po- lice and coroner took a lenient view of the trans- action, and beyond a mild censure for the delay in registering the lady's decease, the lucky owner got away scatheless from this strange incident in a career which has now outlived its shadows and promises to end in an honoured old age. \n\n\nThe Adventure of the Retired Colourman \n\n\nThe Adventure of the Retired Colourman \n\n\n\nout?' \n\n\nherlock Holmes was in a melancholy and philosophic mood that morning. His alert practical nature was subject to such reactions. \n'Did you see him?\" he asked. \n'You mean the old fellow who has just gone \n\"Precisely.\" \n\"Yes, I met him at the door. \" \n\"What did you think of him?\" \n\"A pathetic, futile, broken creature.\" \n\"Exactly, Watson. Pathetic and futile. But is not all life pathetic and futile? Is not his story a micro- cosm of the whole? We reach. We grasp. And what is left in our hands at the end? A shadow. Or worse than a shadow \u2014 misery.\" \n\"Is he one of your clients?\" \n\"Well, I suppose I may call him so. He has been sent on by the Yard. Just as medical men occasion- ally send their incurables to a quack. They argue that they can do nothing more, and that whatever\n######################\noutput:"}}
{"type": "error", "data": "Entity Extraction Error", "stack": "Traceback (most recent call last):\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\index\\graph\\extractors\\graph\\graph_extractor.py\", line 118, in __call__\n    result = await self._process_document(text, prompt_variables)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\index\\graph\\extractors\\graph\\graph_extractor.py\", line 146, in _process_document\n    response = await self._llm(\n               ^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\openai\\json_parsing_llm.py\", line 34, in __call__\n    result = await self._delegate(input, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\openai\\openai_token_replacing_llm.py\", line 37, in __call__\n    return await self._delegate(input, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\openai\\openai_history_tracking_llm.py\", line 33, in __call__\n    output = await self._delegate(input, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\caching_llm.py\", line 104, in __call__\n    result = await self._delegate(input, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\rate_limiting_llm.py\", line 177, in __call__\n    result, start = await execute_with_retry()\n                    ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\rate_limiting_llm.py\", line 159, in execute_with_retry\n    async for attempt in retryer:\n  File 
\"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\tenacity\\asyncio\\__init__.py\", line 166, in __anext__\n    do = await self.iter(retry_state=self._retry_state)\n         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\tenacity\\asyncio\\__init__.py\", line 153, in iter\n    result = await action(retry_state)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\tenacity\\_utils.py\", line 99, in inner\n    return call(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\tenacity\\__init__.py\", line 418, in exc_check\n    raise retry_exc.reraise()\n          ^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\tenacity\\__init__.py\", line 185, in reraise\n    raise self.last_attempt.result()\n          ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Python311\\Lib\\concurrent\\futures\\_base.py\", line 449, in result\n    return self.__get_result()\n           ^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Python311\\Lib\\concurrent\\futures\\_base.py\", line 401, in __get_result\n    raise self._exception\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\rate_limiting_llm.py\", line 165, in execute_with_retry\n    return await do_attempt(), start\n           ^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\rate_limiting_llm.py\", line 151, in do_attempt\n    await sleep_for(sleep_time)\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\rate_limiting_llm.py\", line 147, in do_attempt\n    return await self._delegate(input, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File 
\"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\base_llm.py\", line 53, in __call__\n    return await self._invoke(input, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\base\\base_llm.py\", line 57, in _invoke\n    output = await self._execute_llm(input, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\llm\\openai\\openai_chat_llm.py\", line 55, in _execute_llm\n    completion = await self.client.chat.completions.create(\n                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\resources\\chat\\completions.py\", line 1289, in create\n    return await self._post(\n           ^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\_base_client.py\", line 1805, in post\n    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\_base_client.py\", line 1503, in request\n    return await self._request(\n           ^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\openai\\_base_client.py\", line 1599, in _request\n    raise self._make_status_error_from_response(err.response) from None\nopenai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 52 seconds. 
Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}\n", "source": "Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 52 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}", "details": {"doc_index": 0, "text": "ly premeditated, then the means of covering it are coolly premeditated also. I hope, therefore, that we are in the presence of a serious misconception.\" \n\"But there is so much to explain.\" \n\"Well, we shall set about explaining it. When once your point of view is changed, the very thing which was so damning becomes a clue to the truth. For example, there is this revolver. Miss Dunbar disclaims all knowledge of it. On our new theory she is speaking truth when she says so. There- fore, it was placed in her wardrobe. Who placed it there? Someone who wished to incriminate her. Was not that person the actual criminal? You see how we come at once upon a most fruitful line of inquiry.\" \nWe were compelled to spend the night at Winchester, as the formalities had not yet been completed, but next morning, in the company of Mr. Joyce Cummings, the rising barrister who was entrusted with the defence, we were allowed to see the young lady in her cell. I had expected from all that we had heard to see a beautiful woman, but I can never forget the effect which Miss Dunbar pro- duced upon me. It was no wonder that even the masterful millionaire had found in her something more powerful than himself \u2014 something which could control and guide him. One felt, too, as one looked at the strong, clear-cut, and yet sensitive face, that even should she be capable"}}
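
Both entries above end in a 429 token-rate-limit response, which fits the too-many-threads explanation. One generic way to keep fewer requests in flight is to wrap each call in a semaphore. The following is a minimal sketch of that idea, not graphrag's own rate limiter; `run_limited` and `max_concurrent` are hypothetical names introduced only for illustration.

```python
import asyncio


async def run_limited(factories, max_concurrent):
    """Run zero-argument coroutine factories, at most max_concurrent at once.

    Hypothetical helper for illustration; not part of graphrag.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(factory):
        # Each call waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await factory()

    # gather() returns results in the same order as the input factories.
    return await asyncio.gather(*(guarded(f) for f in factories))
```

Lowering the limit trades throughput for fewer rejected requests; the right value depends on the quota of the deployment being called.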

Then, there are these log entries that I think are just the consequence of some failure above?

{"type": "error", "data": "Error executing verb \"cluster_graph\" in create_base_entity_graph: Columns must be same length as key", "stack": "Traceback (most recent call last):\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\datashaper\\workflow\\workflow.py\", line 410, in _execute_verb\n    result = node.verb.func(**verb_args)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\index\\verbs\\graph\\clustering\\cluster_graph.py\", line 102, in cluster_graph\n    output_df[[level_to, to]] = pd.DataFrame(\n    ~~~~~~~~~^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\pandas\\core\\frame.py\", line 4299, in __setitem__\n    self._setitem_array(key, value)\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\pandas\\core\\frame.py\", line 4341, in _setitem_array\n    check_key_length(self.columns, key, value)\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\pandas\\core\\indexers\\utils.py\", line 390, in check_key_length\n    raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}
{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\index\\run.py\", line 323, in run_pipeline\n    result = await workflow.run(context, callbacks)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\datashaper\\workflow\\workflow.py\", line 369, in run\n    timing = await self._execute_verb(node, context, callbacks)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\datashaper\\workflow\\workflow.py\", line 410, in _execute_verb\n    result = node.verb.func(**verb_args)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphrag\\index\\verbs\\graph\\clustering\\cluster_graph.py\", line 102, in cluster_graph\n    output_df[[level_to, to]] = pd.DataFrame(\n    ~~~~~~~~~^^^^^^^^^^^^^^^^\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\pandas\\core\\frame.py\", line 4299, in __setitem__\n    self._setitem_array(key, value)\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\pandas\\core\\frame.py\", line 4341, in _setitem_array\n    check_key_length(self.columns, key, value)\n  File \"C:\\Users\\steventruitt\\source\\repos\\graphrag\\graphragtest\\Lib\\site-packages\\pandas\\core\\indexers\\utils.py\", line 390, in check_key_length\n    raise ValueError(\"Columns must be same length as key\")\nValueError: Columns must be same length as key\n", "source": "Columns must be same length as key", "details": null}
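
To judge how much of the run was lost to errors, it can help to count the error entries by message rather than read them one by one. This is a minimal sketch, assuming the log file holds one JSON object per line with the "type" and "data" fields seen in the entries above; `tally_errors` is a hypothetical helper, not something graphrag ships.

```python
import json
from collections import Counter


def tally_errors(log_lines):
    """Count log entries of type "error", grouped by their "data" message."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Not a complete JSON entry (for example a wrapped fragment); skip it.
            continue
        if isinstance(entry, dict) and entry.get("type") == "error":
            counts[entry.get("data", "<no data field>")] += 1
    return counts
```

Passing an open file handle for the log works, since iterating a text file yields its lines. A large count of "Entity Extraction Error" relative to the number of text chunks would suggest that most chunks never produced entities.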

The stats.json shows this, so I'm pretty confident that it fails on the merge_graphs entity graph creation:

{
    "total_runtime": 10167.02763581276,
    "num_documents": 1,
    "input_load_time": 0,
    "workflows": {
        "create_base_text_units": {
            "overall": 2.7237725257873535,
            "0_orderby": 0.0060007572174072266,
            "1_zip": 0.006998538970947266,
            "2_aggregate_override": 0.008006811141967773,
            "3_chunk": 2.4501953125,
            "4_select": 0.006997585296630859,
            "5_unroll": 0.01601433753967285,
            "6_rename": 0.008991479873657227,
            "7_genid": 0.11474370956420898,
            "8_unzip": 0.007005214691162109,
            "9_copy": 0.007086038589477539,
            "10_filter": 0.08634305000305176
        },
        "create_base_extracted_entities": {
            "overall": 10163.398600578308,
            "0_entity_extract": 10163.042577505112,
            "1_merge_graphs": 0.3440265655517578
        },
        "create_summarized_entities": {
            "overall": 0.0209963321685791,
            "0_summarize_descriptions": 0.012000083923339844
        }
    }
}

Jun 28 '24 14:06 stevetru1

I got past this issue by using a different dataset, so I can isolate the problem to the text file I was using. My suspicion is that it has something to do with the text formatting of that file. Will update as I find out more.

Jun 28 '24 15:06 stevetru1

I resolved this by converting the text encoding from 'US-ASCII' to 'UTF-8'.
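
For anyone wanting to script the same conversion, here is a minimal sketch. Any pure US-ASCII file is already valid UTF-8, so a re-encode like this only changes files that contain other bytes. The `cp1252` fallback is an assumption about what a Windows-created text file might use, and `rewrite_as_utf8` is a hypothetical helper name.

```python
from pathlib import Path


def rewrite_as_utf8(src, dst, fallback_encoding="cp1252"):
    """Write a UTF-8 copy of src and return the encoding used to decode it."""
    raw = Path(src).read_bytes()
    try:
        text = raw.decode("utf-8")
        used = "utf-8"
    except UnicodeDecodeError:
        # Assumption: the file is in a legacy single-byte encoding.
        # Bytes the fallback cannot map become U+FFFD instead of raising.
        text = raw.decode(fallback_encoding, errors="replace")
        used = fallback_encoding
    # write_bytes avoids any platform newline translation.
    Path(dst).write_bytes(text.encode("utf-8"))
    return used
```

If the function returns "utf-8", the input was already valid UTF-8 and the copy is byte-identical, so the cause of the failure would have to lie elsewhere.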

Jul 05 '24 12:07 sohit3832

My rule of thumb when running into issues with cluster graph is to check whether entity extraction was actually successful. When inspecting the cache files, you should see more than just the prompt.

Generally, a failure while clustering is a symptom of 0 extracted entities.
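
That check can be scripted roughly as follows. This is only a sketch under stated assumptions: it treats every file under the given directory as a JSON object that stores the model response under a "result" key (the key name is an assumption about the cache layout, so adjust `result_key` if yours differs), and it looks for the `("entity"` marker that the extraction prompt asks the model to emit. `cache_files_without_entities` is a hypothetical helper.

```python
import json
from pathlib import Path


def cache_files_without_entities(cache_dir, result_key="result"):
    """List cache files whose stored response contains no ("entity" record."""
    suspicious = []
    for path in sorted(Path(cache_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            payload = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            # Unreadable entry: flag it for a manual look.
            suspicious.append(path)
            continue
        # result_key is an assumed field name, not a documented contract.
        result = payload.get(result_key) if isinstance(payload, dict) else None
        if not isinstance(result, str) or '("entity"' not in result:
            suspicious.append(path)
    return suspicious
```

Point it at the directory holding only the entity extraction cache entries; if every file comes back flagged, the clustering failure is most likely the downstream effect of an empty graph.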

Will close this as it seems you have it working now, but please reopen if needed :)

Jul 06 '24 02:07 AlonsoGuevara