kotaemon
[BUG] - The error occurred while using the graphrag collection feature
Description
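Querying with the graphrag collection feature fails with a NotADirectoryError: the retriever tries to read create_final_nodes.parquet from a path that runs through stats.json, which is a file rather than a directory. The full traceback is under Logs.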
Reproduction steps
Upload files to the graphrag collection, select the graphrag collection feature, and send a chat message.
Screenshots
No response
Logs
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x7f3f078073a0>, FSPath=PosixPath('/app/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x7f3f07a40040>, get_extra_table=False, llm_scorer=None, mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0', use_key_from_ktem=True)], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x7f3fcec92320>, FSPath=<theflow.base.unset_ object at 0x7f3fcec92320>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x7f3fcec92320>, VS=<theflow.base.unset_ object at 0x7f3fcec92320>, file_ids=['60513354-be19-42c6-a4fb-b65887c2bbe7'], user_id=<theflow.base.unset_ object at 0x7f3fcec92320>)]
searching in doc_ids []
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 575, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1923, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 663, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 656, in __anext__
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 639, in run_sync_iterator_async
return next(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 801, in gen_wrapper
response = next(iterator)
File "/app/libs/ktem/ktem/pages/chat/__init__.py", line 804, in chat_fn
for response in pipeline.stream(chat_input, conversation_id, chat_history):
File "/app/libs/ktem/ktem/reasoning/simple.py", line 655, in stream
docs, infos = self.retrieve(message, history)
File "/app/libs/ktem/ktem/reasoning/simple.py", line 483, in retrieve
retriever_docs = retriever_node(text=query)
File "/usr/local/lib/python3.10/site-packages/theflow/base.py", line 1097, in __call__
raise e from None
File "/usr/local/lib/python3.10/site-packages/theflow/base.py", line 1088, in __call__
output = self.fl.exec(func, args, kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/backends/base.py", line 151, in exec
return run(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/middleware.py", line 144, in __call__
raise e from None
File "/usr/local/lib/python3.10/site-packages/theflow/middleware.py", line 141, in __call__
_output = self.next_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/middleware.py", line 117, in __call__
return self.next_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/base.py", line 1017, in _runx
return self.run(*args, **kwargs)
File "/app/libs/ktem/ktem/index/file/graph/pipelines.py", line 321, in run
context_builder = self._build_graph_search()
File "/app/libs/ktem/ktem/index/file/graph/pipelines.py", line 198, in _build_graph_search
entity_df = pd.read_parquet(f"{INPUT_DIR}/{ENTITY_TABLE}.parquet")
File "/usr/local/lib/python3.10/site-packages/pandas/io/parquet.py", line 667, in read_parquet
return impl.read(
File "/usr/local/lib/python3.10/site-packages/pandas/io/parquet.py", line 267, in read
path_or_handle, handles, filesystem = _get_path_or_handle(
File "/usr/local/lib/python3.10/site-packages/pandas/io/parquet.py", line 140, in _get_path_or_handle
handles = get_handle(
File "/usr/local/lib/python3.10/site-packages/pandas/io/common.py", line 882, in get_handle
handle = open(handle, ioargs.mode)
NotADirectoryError: [Errno 20] Not a directory: '/app/ktem_app_data/user_data/files/graphrag/15f966fc-a057-4bb7-b308-8a007cce8110/output/stats.json/artifacts/create_final_nodes.parquet'
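A possible reading of the final error, offered as a guess rather than a confirmed diagnosis: the path contains `output/stats.json/artifacts/...`, so `stats.json` appears to have been picked as the run directory when the input path was built. The sketch below reproduces that kind of mistake and shows one way to guard against it. The helper `find_artifacts_dir`, the run directory name `20240101-000000`, and the assumed layout `output/<run>/artifacts` are hypothetical and are not taken from the kotaemon source.

```python
import tempfile
from pathlib import Path


def find_artifacts_dir(output_dir: Path) -> Path:
    """Return the artifacts directory of the last run under output_dir.

    Hypothetical helper: only real directories that contain an
    'artifacts' subdirectory are considered, so stray files such as
    stats.json can never end up in the middle of the path.
    """
    run_dirs = sorted(
        p for p in output_dir.iterdir()
        if p.is_dir() and (p / "artifacts").is_dir()
    )
    if not run_dirs:
        raise FileNotFoundError(f"no run with artifacts under {output_dir}")
    return run_dirs[-1] / "artifacts"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        output_dir = Path(tmp) / "output"
        # Assumed layout: one run directory plus a stats.json file beside it.
        (output_dir / "20240101-000000" / "artifacts").mkdir(parents=True)
        (output_dir / "stats.json").write_text("{}")

        # Naive choice: the last entry by name is the file, not the run.
        naive = sorted(output_dir.iterdir())[-1]
        print("naive pick:", naive.name)

        # Guarded choice: only directories qualify.
        print("guarded pick:", find_artifacts_dir(output_dir))
```

If the index step did not finish, the guarded version raises a FileNotFoundError that names the output directory, which is easier to act on than the NotADirectoryError above.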
Browsers
Microsoft Edge
OS
Linux
Additional information
No response