[Bug]: PostgreSQL restart breaks LightRAG permanently (no auto-reconnect, and /health still returns OK)
Do you need to file an issue?
- [x] I have searched the existing issues and this bug is not already filed.
- [x] I believe this is a legitimate bug, not just a question or feature request.
Describe the bug
When using LightRAG with PostgreSQL storage (PGKVStorage, PGGraphStorage, PGVectorStorage, PGDocStatusStorage), the application does not recover after PostgreSQL restarts.
- The LightRAG container keeps running
- The /health endpoint still returns 200
- Every RAG operation nevertheless fails internally because the PostgreSQL connection pool is stale
The only way to recover is to manually restart or redeploy the LightRAG container.
As a result, LightRAG is not resilient to database restarts or failovers.
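To illustrate the gap described above, here is a minimal sketch contrasting a liveness-only health handler with one that probes the database. This is not LightRAG's actual handler: the function names, the `DbProbe` type, the 2-second timeout, and the 503 status code are all assumptions made for illustration. The probe stands in for something like running `SELECT 1` on a pooled connection.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# Hypothetical probe type: an async callable that raises when the database
# cannot be reached (for example, one that runs "SELECT 1" through the pool).
DbProbe = Callable[[], Awaitable[None]]


async def liveness_only_health() -> Tuple[int, Dict[str, str]]:
    """Only proves the process is up; says nothing about the database."""
    return 200, {"status": "healthy"}


async def db_aware_health(
    probe: DbProbe, timeout: float = 2.0
) -> Tuple[int, Dict[str, str]]:
    """Return 503 (an assumed status code) when the probe fails or times out."""
    try:
        await asyncio.wait_for(probe(), timeout=timeout)
    except (OSError, asyncio.TimeoutError) as exc:
        # ConnectionError and its subclasses derive from OSError.
        return 503, {"status": "unhealthy", "detail": f"database: {exc!r}"}
    return 200, {"status": "healthy"}
```

With a handler of the second kind, an orchestrator health check (ECS, Compose, and so on) could notice the stale pool and restart or drain the container on its own instead of requiring a manual redeploy.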
Steps to reproduce
1. Deploy LightRAG using the official image:
   ghcr.io/hkuds/lightrag:latest
2. Configure PostgreSQL storage in .env:
   LIGHTRAG_KV_STORAGE=PGKVStorage
   LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
   LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
   LIGHTRAG_VECTOR_STORAGE=PGVectorStorage
3. Start LightRAG (Docker, ECS, or Compose).
4. Restart PostgreSQL:
   docker restart postgres
5. Query LightRAG again via the API or UI.
Expected Behavior
- When PostgreSQL restarts, LightRAG should retry DB connections
- It should detect stale connections and rebuild the connection pool
- The /health endpoint should report UNHEALTHY when the DB is unreachable
- LightRAG should survive DB failover without restarting the container manually
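The retry and pool-rebuild behavior requested above could look roughly like the sketch below. It is deliberately driver-agnostic and is not taken from LightRAG's code: `ResilientPool`, `pool_factory`, `retryable`, and the backoff values are hypothetical names and defaults. A real driver raises its own exception types for dropped connections, and those would need to be added to `retryable`. A real implementation should also close or terminate the discarded pool.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional, Tuple, Type


class ResilientPool:
    """Runs operations against a pool and rebuilds the pool on connection errors.

    Create instances inside a running event loop (the lock is an asyncio.Lock).
    """

    def __init__(
        self,
        pool_factory: Callable[[], Awaitable[Any]],  # e.g. wraps the driver's pool constructor
        max_attempts: int = 3,
        base_delay: float = 0.5,
        retryable: Tuple[Type[BaseException], ...] = (OSError,),
    ) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self._pool_factory = pool_factory
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._retryable = retryable
        self._pool: Optional[Any] = None
        self._lock = asyncio.Lock()

    async def _get_pool(self) -> Any:
        # The lock prevents concurrent callers from building several pools at once.
        async with self._lock:
            if self._pool is None:
                self._pool = await self._pool_factory()
            return self._pool

    async def _discard(self, stale: Any) -> None:
        async with self._lock:
            # Drop the pool only if another caller has not already replaced it.
            if self._pool is stale:
                self._pool = None

    async def run(self, operation: Callable[[Any], Awaitable[Any]]) -> Any:
        """Call operation(pool); on a retryable error, rebuild the pool and retry."""
        last_exc: Optional[BaseException] = None
        for attempt in range(1, self._max_attempts + 1):
            pool = None
            try:
                pool = await self._get_pool()
                return await operation(pool)
            except self._retryable as exc:
                last_exc = exc
                if pool is not None:
                    await self._discard(pool)
                if attempt < self._max_attempts:
                    # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
                    await asyncio.sleep(self._base_delay * 2 ** (attempt - 1))
        assert last_exc is not None
        raise last_exc
```

Each storage backend would route its queries through something like `await resilient.run(lambda pool: ...)`, so a PostgreSQL restart costs a few retried calls rather than a permanently broken process. Note that blind retries are only safe for idempotent operations.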
LightRAG Config Used
LightRAG Test Configuration
HOST=0.0.0.0
PORT=9621
Workspace
WORKSPACE=testworkspace
LLM (not relevant to DB issue, but included)
LLM_BINDING=ollama
LLM_MODEL=deepseek-v3.1:671b-cloud
LLM_BINDING_HOST=https://llm.test.com
LLM_BINDING_API_KEY=dummy-key
OLLAMA_LLM_NUM_CTX=32768
Embedding
EMBEDDING_BINDING=ollama
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_BINDING_HOST=https://llm.test.com
EMBEDDING_BINDING_API_KEY=dummy-key
EMBEDDING_TIMEOUT=600
Storage (PostgreSQL)
LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage
PostgreSQL Settings
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_USER=testuser
POSTGRES_PASSWORD=testpassword
POSTGRES_DATABASE=testdb
POSTGRES_MAX_CONNECTIONS=10
POSTGRES_WORKSPACE=testworkspace
PostgreSQL Vector Index
POSTGRES_VECTOR_INDEX_TYPE=HNSW
POSTGRES_HNSW_M=16
POSTGRES_HNSW_EF=200
Logging
LOG_LEVEL=INFO
Performance / Timeouts
HTTPX_TIMEOUT=300
LLM_TIMEOUT=300
MAX_ASYNC=4
MAX_PARALLEL_INSERT=2
Other optional config
ENABLE_LLM_CACHE=true
ENABLE_LLM_CACHE_FOR_EXTRACT=true
SUMMARY_LANGUAGE=English
Logs and screenshots
Additional Information
- LightRAG Version:
- Operating System:
- Python Version:
- Related Issues: