[Bug]: Context Reference section does not correctly display file_id or file_path
Do you need to file an issue?
- [x] I have searched the existing issues and this bug is not already filed.
- [x] I believe this is a legitimate bug, not just a question or feature request.
Describe the bug
The references listed in a query response do not display the correct file_id or source.
Steps to reproduce
Send a query in any mode (global, hybrid, local, mix, or naive) and inspect the references section of the response.
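For anyone trying to reproduce this, the step above could be scripted roughly as follows. This is a minimal sketch, assuming the server exposes a `POST /query` endpoint that accepts a JSON body with `query` and `mode` fields and is reachable over plain HTTP on the configured port; the helper names, the question text, and the base URL are illustrative and not taken from my setup.

```python
import json
import urllib.request

# The five query modes covered in this report.
MODES = ["global", "hybrid", "local", "mix", "naive"]


def build_query_request(base_url: str, question: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one query mode.

    Assumption: the endpoint is `<base_url>/query` and takes
    {"query": ..., "mode": ...} as JSON.
    """
    body = json.dumps({"query": question, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_all_modes(base_url: str, question: str) -> dict:
    """Send the same question once per mode and collect the decoded replies."""
    replies = {}
    for mode in MODES:
        request = build_query_request(base_url, question, mode)
        with urllib.request.urlopen(request, timeout=90) as response:
            replies[mode] = json.loads(response.read())
    return replies
```

Calling `run_all_modes("http://localhost:9621", "What is this document about?")` and comparing the references in each reply should show whether the problem is the same in every mode. If an API key is configured, the request would need the matching header as well.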
Expected Behavior
The file references should correctly indicate the filename or source.
LightRAG Config Used
Server Configuration
HOST=0.0.0.0
PORT=9621
NAMESPACE_PREFIX=lightrag # separating data from different LightRAG instances
CORS_ORIGINS=http://localhost:3000,http://localhost:8080
Optional SSL Configuration
SSL=true
SSL_CERTFILE=/path/to/cert.pem
SSL_KEYFILE=/path/to/key.pem
Security (leave empty if no API key is needed)
LIGHTRAG_API_KEY=
LLM_BINDING="azure_openai"
LLM_BINDING_HOST=
LLM_MODEL="gpt-4o"
LLM_BINDING_API_KEY=
EMBEDDING_BINDING="azure_openai"
EMBEDDING_MODEL=text-embedding-3-large
AZURE_OPENAI_API_VERSION="2024-10-21"
AZURE_OPENAI_DEPLOYMENT="gpt-4o"
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_ENDPOINT=
Settings for document indexing
CHUNK_SIZE=1200
CHUNK_OVERLAP_SIZE=100
MAX_TOKENS=32768 # Max tokens sent to LLM for summarization
MAX_TOKEN_SUMMARY=500 # Max tokens for entity or relation summary
SUMMARY_LANGUAGE=English
EMBEDDING_DIM=3072
Logging level
LOG_LEVEL=INFO
VERBOSE=True
Max async calls for LLM
MAX_ASYNC=8
Optional Timeout for LLM
TIMEOUT=90 # Timeout in seconds, None for infinite timeout
Settings for RAG query
HISTORY_TURNS=3
COSINE_THRESHOLD=0.2
TOP_K=20
MAX_TOKEN_TEXT_CHUNK=20000
MAX_TOKEN_RELATION_DESC=4000
MAX_TOKEN_ENTITY_DESC=4000
WORKING_DIR=\\sgnt-dev-sgvt01\d$\VALI_DB
Data storage selection
LIGHTRAG_KV_STORAGE=JsonKVStorage
LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage
LIGHTRAG_GRAPH_STORAGE=NetworkXStorage
LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage
Logs and screenshots
Additional Information
- LightRAG Version: 1.2.6
- Operating System: Windows 11
- Python Version: 3.12.5
- Related Issues:
One screenshot was taken for each mode. From top to bottom: global, hybrid, local, mix, naive.
This one is from the WebUI document management tab. I think it may be related.
The error message started appearing after updating lightrag to 1.2.6:
And here is the corresponding error log from the LightRAG server:
INFO: 77.3.20.100:52839 - "GET /documents HTTP/1.1" 500
ERROR: Error GET /documents: DocProcessingStatus.__init__() missing 1 required positional argument: 'file_path'
ERROR: Traceback (most recent call last):
  File "C:\DEV\LightRAG_SVR_Vali\lightrag\api\routers\document_routes.py", line 827, in documents
    results: List[Dict[str, DocProcessingStatus]] = await asyncio.gather(*tasks)
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\DEV\LightRAG_SVR_Vali\lightrag\lightrag.py", line 1537, in get_docs_by_status
    return await self.doc_status.get_docs_by_status(status)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\DEV\LightRAG_SVR_Vali\lightrag\kg\json_doc_status_impl.py", line 90, in get_docs_by_status
    result[k] = DocProcessingStatus(**data)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: DocProcessingStatus.__init__() missing 1 required positional argument: 'file_path'
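My reading of the traceback, which I have not confirmed, is that document status records stored before the update have no `file_path` key, so unpacking them into a class that now requires that field fails. The sketch below reproduces that failure mode in isolation and shows one possible workaround: backfilling a placeholder before constructing the object. `DocStatusSketch`, `load_record`, the other field names, and the `"unknown_source"` placeholder are hypothetical stand-ins, not LightRAG's actual class or fix.

```python
from dataclasses import dataclass


@dataclass
class DocStatusSketch:
    """Hypothetical stand-in for a status record; not LightRAG's real class."""
    content_summary: str
    status: str
    file_path: str  # required field that older stored records may lack


def load_record(data: dict) -> DocStatusSketch:
    """Build a record, backfilling file_path when the stored data has none."""
    patched = dict(data)  # copy so the stored record is left untouched
    patched.setdefault("file_path", "unknown_source")  # placeholder value (assumption)
    return DocStatusSketch(**patched)


# A record shaped like one written before file_path existed (made-up values).
legacy_record = {"content_summary": "intro text", "status": "processed"}

# Direct unpacking fails in the same way as the server log above.
try:
    DocStatusSketch(**legacy_record)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# The tolerant loader succeeds and marks the source as unknown.
recovered = load_record(legacy_record)
```

If this reading is right, the same gap could also explain the missing source in the query references, since there would be no stored path to display for documents indexed before the update.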