
[Bug]: PostgreSQL restart breaks LightRAG permanently - no auto-reconnect & /health still returns OK

Open • akash171198 opened this issue 1 month ago • 1 comment

Do you need to file an issue?

  • [x] I have searched the existing issues and this bug is not already filed.
  • [x] I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

When using LightRAG with PostgreSQL storage (PGKVStorage, PGGraphStorage, PGVectorStorage, PGDocStatusStorage), the application does not recover after PostgreSQL restarts.

  • The LightRAG container continues running
  • /health endpoint still returns 200
  • But every RAG operation fails internally because the PostgreSQL connection pool is stale

The only way to recover is to restart/redeploy the LightRAG container manually.

As a result, LightRAG is not resilient to database restarts or failovers.

Steps to reproduce

Deploy LightRAG using the official image:

ghcr.io/hkuds/lightrag:latest

Configure PostgreSQL using .env:

LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage

Start LightRAG (Docker, ECS, or Compose)

Restart PostgreSQL:

docker restart postgres

Try querying LightRAG again via API or UI
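The last step can be scripted so the symptom is easy to confirm. The sketch below uses only the Python standard library; the host, the `/query` request body, and the absence of an API key are assumptions about the deployment, and `http_status`, `reproduces_bug`, and `check` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

# PORT=9621 comes from the config in this issue; the host is an assumption.
BASE_URL = "http://localhost:9621"


def http_status(url: str, payload: Optional[dict] = None, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code, or None if the server cannot be reached.

    Sends a GET when payload is None, otherwise a JSON POST.
    """
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def reproduces_bug(health_code: Optional[int], query_code: Optional[int]) -> bool:
    """True when /health reports 200 while the query itself does not succeed."""
    return health_code == 200 and query_code != 200


def check(base_url: str = BASE_URL) -> bool:
    """Probe /health and /query once and report whether the symptom is present."""
    health = http_status(f"{base_url}/health")
    # Assumed request shape for the query endpoint; adjust to your deployment.
    query = http_status(f"{base_url}/query", {"query": "ping", "mode": "naive"})
    print(f"/health -> {health}, /query -> {query}")
    return reproduces_bug(health, query)
```

Calling `check()` once before and once after `docker restart postgres` should show the difference: if the bug is present, the second call returns `True` because /health is still 200 while the query fails.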

Expected Behavior

  • When PostgreSQL restarts, LightRAG should retry DB connections
  • It should detect stale connections and rebuild the connection pool
  • /health endpoint should return UNHEALTHY when DB is unreachable
  • LightRAG should survive DB failover without restarting the container manually
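The retry and pool-rebuild behavior described above could look roughly like the following. This is a sketch under stated assumptions, not LightRAG's code: `run_with_reconnect` and `FakePool` are hypothetical, and the built-in `ConnectionError` stands in for whatever connection exceptions the real PostgreSQL driver raises.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def run_with_reconnect(
    operation: Callable[[], Awaitable[T]],
    rebuild_pool: Callable[[], Awaitable[None]],
    retries: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Run `operation`; on a connection error, rebuild the pool and retry with backoff."""
    for attempt in range(retries + 1):
        try:
            return await operation()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
            try:
                await rebuild_pool()
            except ConnectionError:
                pass  # database still down; the next attempt fails and retries again
    raise AssertionError("unreachable")


class FakePool:
    """Stand-in for a driver pool: queries fail until the pool has been rebuilt."""

    def __init__(self) -> None:
        self.stale = True
        self.rebuilds = 0

    async def query(self) -> str:
        if self.stale:
            raise ConnectionResetError("connection was closed")
        return "ok"

    async def rebuild(self) -> None:
        self.rebuilds += 1
        self.stale = False
```

For example, `asyncio.run(run_with_reconnect(pool.query, pool.rebuild))` with a fresh `FakePool` fails once, rebuilds the pool, and then succeeds. Keeping the retry count bounded matters here: if the database stays down, the error should still reach the caller (and the health check) rather than hang forever.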

LightRAG Config Used


LightRAG Test Configuration

HOST=0.0.0.0
PORT=9621

Workspace

WORKSPACE=testworkspace

LLM (not relevant to DB issue, but included)

LLM_BINDING=ollama
LLM_MODEL=deepseek-v3.1:671b-cloud
LLM_BINDING_HOST=https://llm.test.com
LLM_BINDING_API_KEY=dummy-key
OLLAMA_LLM_NUM_CTX=32768

Embedding

EMBEDDING_BINDING=ollama
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_BINDING_HOST=https://llm.test.com
EMBEDDING_BINDING_API_KEY=dummy-key
EMBEDDING_TIMEOUT=600

Storage (PostgreSQL)

LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage

PostgreSQL Settings

POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_USER=testuser
POSTGRES_PASSWORD=testpassword
POSTGRES_DATABASE=testdb
POSTGRES_MAX_CONNECTIONS=10
POSTGRES_WORKSPACE=testworkspace

PostgreSQL Vector Index

POSTGRES_VECTOR_INDEX_TYPE=HNSW
POSTGRES_HNSW_M=16
POSTGRES_HNSW_EF=200

Logging

LOG_LEVEL=INFO

Performance / Timeouts

HTTPX_TIMEOUT=300
LLM_TIMEOUT=300
MAX_ASYNC=4
MAX_PARALLEL_INSERT=2

Other optional config

ENABLE_LLM_CACHE=true
ENABLE_LLM_CACHE_FOR_EXTRACT=true
SUMMARY_LANGUAGE=English

Logs and screenshots

[Image: screenshot attached to the issue]

Additional Information

  • LightRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:

akash171198, Nov 13 '25 17:11