
[BUG] Index Building failed

Open Jaysagar07 opened this issue 8 months ago • 2 comments

**Describe the bug**
Deployment completed successfully, but index building fails: progress stalls at 18% during the `extract_graph` workflow, and the job then fails.

Log output:

```
{'type': 'on_workflow_start', 'data': 'Index: book-graph-index1 -- Workflow (1/11): create_base_text_units started.', 'details': {'workflow_name': 'create_base_text_units', 'index_name': 'book-graph-index1'}}
{'type': 'on_workflow_end', 'data': 'Index: book-graph-index1 -- Workflow (1/11): create_base_text_units complete.', 'details': {'workflow_name': 'create_base_text_units', 'index_name': 'book-graph-index1'}}
{'type': 'on_workflow_start', 'data': 'Index: book-graph-index1 -- Workflow (2/11): create_final_documents started.', 'details': {'workflow_name': 'create_final_documents', 'index_name': 'book-graph-index1'}}
{'type': 'on_workflow_end', 'data': 'Index: book-graph-index1 -- Workflow (2/11): create_final_documents complete.', 'details': {'workflow_name': 'create_final_documents', 'index_name': 'book-graph-index1'}}
{'type': 'on_workflow_start', 'data': 'Index: book-graph-index1 -- Workflow (3/11): extract_graph started.', 'details': {'workflow_name': 'extract_graph', 'index_name': 'book-graph-index1'}}
{'type': 'error', 'data': 'Error Invoking LLM', 'cause': "Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}"}
```

Stack trace:

```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/fnllm/base/base.py", line 112, in call
    return await self._invoke(prompt, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fnllm/base/base.py", line 128, in _invoke
    return await self._decorated_target(prompt, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/json.py", line 71, in invoke
    return await delegate(prompt, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/retryer.py", line 109, in invoke
    result = await execute_with_retry()
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/retryer.py", line 93, in execute_with_retry
    async for a in AsyncRetrying(
  File "/usr/local/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self._retry_state)
  File "/usr/local/lib/python3.10/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/usr/local/lib/python3.10/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/tenacity/__init__.py", line 400, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/retryer.py", line 101, in execute_with_retry
    return await attempt()
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/retryer.py", line 78, in attempt
    return await delegate(prompt, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/rate_limiter.py", line 70, in invoke
    result = await delegate(prompt, **args)
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/json.py", line 71, in invoke
    return await delegate(prompt, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fnllm/base/base.py", line 152, in _decorator_target
    output = await self._execute_llm(prompt, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fnllm/openai/llm/chat_text.py", line 155, in _execute_llm
    completion = await self._call_completion_or_cache(
  File "/usr/local/lib/python3.10/site-packages/fnllm/openai/llm/chat_text.py", line 127, in _call_completion_or_cache
    return await self._cache.get_or_insert(
  File "/usr/local/lib/python3.10/site-packages/fnllm/services/cache_interactor.py", line 50, in get_or_insert
    entry = await func()
  File "/usr/local/lib/python3.10/site-packages/openai/resources/chat/completions/completions.py", line 2000, in create
    return await self._post(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1767, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1461, in request
    return await self._request(
  File "/usr/local/lib/python3.10/site-packages/openai/_base_client.py", line 1562, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}
```

**To Reproduce**
From `1-Quickstart.ipynb`:

```python
def build_index(
    storage_name: str,
    index_name: str,
) -> requests.Response:
    """Create a search index.

    This function kicks off a job that builds a knowledge graph index
    from files located in a blob storage container.
    """
    url = endpoint + "/index"
    return requests.post(
        url,
        params={
            "index_container_name": index_name,
            "storage_container_name": storage_name,
        },
        headers=headers,
    )


response = build_index(storage_name=storage_name, index_name=index_name)
print(response)
if response.ok:
    print(response.text)
else:
    print(f"Failed to submit job.\nStatus: {response.text}")
```

```python
# Check status of indexing job
def index_status(index_name: str) -> requests.Response:
    url = endpoint + f"/index/status/{index_name}"
    return requests.get(url, headers=headers)


response = index_status(index_name)
pprint(response.json())
```

**Expected behavior**
Indexing should complete.

**Additional context**
Deployment parameters:

```shell
AI_SEARCH_AUDIENCE="https://search.azure.com"
AISEARCH_ENDPOINT_SUFFIX="search.windows.net"
APIM_NAME=""
APIM_TIER="Developer"
CLOUD_NAME="AzurePublicCloud"
GRAPHRAG_IMAGE="graphrag:backend"
PUBLISHER_EMAIL="[email protected]"
PUBLISHER_NAME="publisher"
RESOURCE_BASE_NAME=""
COGNITIVE_SERVICES_AUDIENCE="https://cognitiveservices.azure.com/.default"
CONTAINER_REGISTRY_LOGIN_SERVER=""
GRAPHRAG_API_BASE=""
GRAPHRAG_API_VERSION="2023-03-15-preview"
GRAPHRAG_LLM_MODEL="gpt-4"
GRAPHRAG_LLM_MODEL_VERSION="turbo-2024-04-09"
GRAPHRAG_LLM_DEPLOYMENT_NAME="gpt-4"
GRAPHRAG_LLM_MODEL_QUOTA="80"
GRAPHRAG_EMBEDDING_MODEL="text-embedding-ada-002"
GRAPHRAG_EMBEDDING_MODEL_VERSION="2"
GRAPHRAG_EMBEDDING_DEPLOYMENT_NAME="text-embedding-ada-002"
GRAPHRAG_EMBEDDING_MODEL_QUOTA="300"
```
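For context on why these parameters matter: Azure OpenAI requests are addressed by a URL assembled from the endpoint base, the *deployment name* (not the model name), and the API version. A minimal sketch of that assembly follows; `aoai_chat_url` and the resource name are illustrative, not part of the accelerator:

```python
def aoai_chat_url(api_base: str, deployment: str, api_version: str) -> str:
    # Azure OpenAI routes chat requests by deployment name. If the base URL,
    # deployment name, or api-version is wrong, the service answers
    # 404 "Resource not found" -- the same error seen in the indexing logs.
    return (
        f"{api_base.rstrip('/')}/openai/deployments/{deployment}"
        f"/chat/completions?api-version={api_version}"
    )


print(aoai_chat_url("https://my-resource.openai.azure.com", "gpt-4",
                    "2023-03-15-preview"))
# → https://my-resource.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2023-03-15-preview
```

So a 404 at the `extract_graph` step points at one of these three values rather than at the workflow code itself.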

Jaysagar07 avatar Apr 25 '25 06:04 Jaysagar07

I'm encountering the same issue where the indexing job consistently fails at 18% during the extract_graph workflow step. The job successfully completes the create_base_text_units and create_final_documents steps, but then fails with the following error:

```
openai.NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}
```

I deployed the solution using the provided scripts, and the deployment completed without errors. I uploaded .txt files using the `upload_files()` function in the notebook, which returned a 200 OK. I then triggered indexing with `build_index()`, which responded with `Indexing job scheduled`.

The model names and deployment names (e.g., gpt-4, text-embedding-ada-002) in my Azure OpenAI resource match what's set in the environment, and both are successfully deployed. However, it's unclear if the backend is referencing them correctly during the LLM invocation.

This seems like a misconfiguration or missing environment variable in the deployed app. Any advice on how to verify which AOAI deployment is actually being called at runtime—or how to debug this further—would be much appreciated.
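One way to narrow this down is to probe the Azure OpenAI endpoint directly with the same URL shape the backend uses, outside of graphrag entirely. A minimal sketch, assuming key-based auth; `build_probe` and the placeholder values are illustrative, not accelerator APIs:

```python
import json
import urllib.request


def build_probe(api_base: str, deployment: str, api_version: str,
                api_key: str) -> urllib.request.Request:
    # Build a one-token chat request against the deployment. Sending it with
    # urlopen() and getting HTTP 404 back reproduces the indexer's
    # "Resource not found" error without involving graphrag at all.
    url = (f"{api_base.rstrip('/')}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    body = json.dumps({
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"api-key": api_key, "Content-Type": "application/json"},
    )


# Fill in your own resource endpoint, deployment name, api-version, and key:
# urllib.request.urlopen(build_probe("https://<resource>.openai.azure.com",
#                                    "gpt-4", "2023-03-15-preview", "<key>"))
```

If this probe succeeds but the indexer still 404s, the backend is likely using a different endpoint, deployment name, or api-version than the one you tested.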

mimimamalah avatar May 05 '25 15:05 mimimamalah

I got the same error. The progress stops at 18.18%. I see

"openai.NotFoundError: Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}\n"

in the logs.

lsukharn avatar May 20 '25 14:05 lsukharn