
[Bug]: Embedding fails with openai.InternalServerError + Ollama v0.13.x breaks embedding integration (works in v0.12.11)

Open · akash171198 opened this issue 2 weeks ago · 1 comment

Do you need to file an issue?

  • [x] I have searched the existing issues and this bug is not already filed.
  • [x] I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

LightRAG fails during document processing when using an OpenAI-compatible API for the LLM and an Ollama server for embeddings.

Two separate issues occur:

1. Embedding failure with the OpenAI-compatible provider

Even though my .env is configured with:

EMBEDDING_BINDING=openai
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_BINDING_HOST=http://<hostedOllama>:11434/v1/

LightRAG throws:

openai.InternalServerError: Error code: 500 {'error': {'message': 'connection refused', 'type': 'api_error'}}

even though the endpoint is reachable and the same request succeeds via curl/Postman.
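One way to isolate this is to call the same endpoint with the same OpenAI client from inside the LightRAG container. A minimal sketch, assuming the host and model values from the config above (<hostedOllama> is a placeholder; the api_key value is arbitrary because Ollama ignores it):

```python
# Call the OpenAI-compatible embeddings endpoint directly, bypassing LightRAG,
# to separate network/endpoint problems from LightRAG integration problems.
from openai import OpenAI

client = OpenAI(
    base_url="http://<hostedOllama>:11434/v1/",  # EMBEDDING_BINDING_HOST from .env
    api_key="ollama",  # any non-empty string; the client just requires one
)
resp = client.embeddings.create(
    model="nomic-embed-text:latest",
    input=["connectivity check"],
)
print(len(resp.data[0].embedding))  # expect 768 for nomic-embed-text
```

If this fails from inside the container but works from the host, the problem is container networking rather than LightRAG itself.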

2. Ollama v0.13.x embedding regression

When embedding requests hit Ollama v0.13.x, LightRAG receives:

do embedding request: Post "http://127.0.0.1:/embedding": EOF

and Ollama v0.13.x logs repeated crashes:

panic: caching disabled but unable to fit entire input in a batch
runner.go:707 starting runner ...
starting runner ...

The SAME configuration works perfectly with Ollama v0.12.11 but breaks on v0.13.0–v0.13.2.
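The panic message suggests a single embedding request carries more input than the v0.13.x runner can fit in one batch. A hedged reproduction sketch against Ollama's native embeddings endpoint, with LightRAG out of the loop (host and input length are assumptions):

```python
# Send a deliberately long input straight to Ollama's native /api/embed endpoint
# to check whether the v0.13.x runner panic reproduces without LightRAG involved.
import requests

long_text = "lorem ipsum dolor sit amet " * 1000  # long enough to stress the batch limit
r = requests.post(
    "http://<hostedOllama>:11434/api/embed",
    json={"model": "nomic-embed-text:latest", "input": long_text},
    timeout=120,
)
print(r.status_code)
print(r.text[:300])  # if the regression reproduces, expect an EOF/500 on v0.13.x
```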

Steps to reproduce

1. Deploy LightRAG (latest Docker image) with this .env:

EMBEDDING_BINDING=openai
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_BINDING_HOST=http://<hostedOllama>:11434/v1/
OLLAMA_EMBEDDING_NUM_CTX=8192

LLM_BINDING=openai
LLM_MODEL=cogito-2.1:671b-cloud
LLM_BINDING_HOST=https://<cloudLLMHost>/v1/
LLM_BINDING_API_KEY=sk_live_xxxxx

2. Run Ollama with the embedding model loaded (a quick liveness check is sketched after these steps):

ollama pull nomic-embed-text
ollama serve

3. Upload any document (JSON/PDF) through the LightRAG UI.

4. Processing fails during the entity-embedding step with openai.InternalServerError.

If Ollama 0.13.x is used → the embedding runner crashes continuously. If Ollama 0.12.11 is used → everything works.
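The liveness check referenced in step 2 can be as simple as asking the server which models it has available. A minimal sketch (host is a placeholder):

```python
# Confirm the Ollama server responds and the embedding model is actually present.
import requests

tags = requests.get("http://<hostedOllama>:11434/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])  # should include nomic-embed-text:latest
```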

Expected Behavior

LightRAG should successfully call the embedding API through the OpenAI-compatible wrapper:

  • Embeddings should process without InternalServerError
  • Ollama embedding runner should not crash
  • Upgrading Ollama should not break LightRAG
  • Document ingestion should complete normally
  • Behavior should match v0.12.11

LightRAG Config Used

HOST=0.0.0.0
PORT=9621
WEBUI_TITLE='G99 Rag7'
WEBUI_DESCRIPTION="Simple and Fast Graph Based RAG System"
TIMEOUT=640
WORKER_TIMEOUT=640

AUTH_ACCOUNTS='admin:Admin@123,akash:Akash@123'
TOKEN_SECRET=dummy_token_secret_123

ENABLE_LLM_CACHE=true

RERANK_BINDING=null

ENABLE_LLM_CACHE_FOR_EXTRACT=true
SUMMARY_LANGUAGE=English

MAX_ASYNC=4
MAX_PARALLEL_INSERT=4
EMBEDDING_BATCH_NUM=10

LLM_TIMEOUT=640
LLM_BINDING=openai
LLM_MODEL=deepseek-v3.1:671b-cloud
LLM_BINDING_HOST=http://:11434/v1/
LLM_BINDING_API_KEY=sk_dummy_llm_key_123

OLLAMA_LLM_NUM_CTX=32768

EMBEDDING_TIMEOUT=640
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_DIM=768
EMBEDDING_BINDING_API_KEY=sk_k***
EMBEDDING_BINDING_HOST=http://:11434/v1/
OLLAMA_EMBEDDING_NUM_CTX=8192

WORKSPACE=business7

HTTPX_TIMEOUT=640

LIGHTRAG_GRAPH_STORAGE=Neo4JStorage

LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage

POSTGRES_HOST=192.168.31.228
POSTGRES_PORT=5432
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DATABASE=postgres
POSTGRES_MAX_CONNECTIONS=12
POSTGRES_WORKSPACE=business7
POSTGRES_VECTOR_INDEX_TYPE=HNSW
POSTGRES_HNSW_M=16
POSTGRES_HNSW_EF=200
POSTGRES_IVFFLAT_LISTS=100
POSTGRES_CONNECTION_RETRIES=3
POSTGRES_CONNECTION_RETRY_BACKOFF=0.5
POSTGRES_CONNECTION_RETRY_BACKOFF_MAX=5.0
POSTGRES_POOL_CLOSE_TIMEOUT=5.0

NEO4J_URI=bolt://192.168.31.228:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=neo_pass_123
NEO4J_DATABASE=neo4j
NEO4J_MAX_CONNECTION_POOL_SIZE=100
NEO4J_CONNECTION_TIMEOUT=30
NEO4J_CONNECTION_ACQUISITION_TIMEOUT=30
NEO4J_MAX_TRANSACTION_RETRY_TIME=30
NEO4J_MAX_CONNECTION_LIFETIME=300
NEO4J_LIVENESS_CHECK_TIMEOUT=30
NEO4J_KEEP_ALIVE=true
NEO4J_WORKSPACE=business7

MONGO_URI=mongodb://root:root@localhost:27017/
MONGO_DATABASE=LightRAG

MILVUS_URI=http://localhost:19530
MILVUS_DB_NAME=lightrag

QDRANT_URL=http://localhost:6333

REDIS_URI=redis://localhost:6379
REDIS_SOCKET_TIMEOUT=30
REDIS_CONNECT_TIMEOUT=10
REDIS_MAX_CONNECTIONS=100
REDIS_RETRY_ATTEMPTS=3

MEMGRAPH_URI=bolt://localhost:7687
MEMGRAPH_DATABASE=memgraph

Logs and screenshots

LightRAG logs:

File "/app/lightrag/utils.py", line 358, in __call__
    return await self.func(*args, **kwargs)
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
    return await copy(fn, *args, **kwargs)
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
    do = await self.iter(retry_state=retry_state)
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
File "/app/.venv/lib/python3.12/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
File "/app/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 400, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 114, in __call__
    result = await fn(*args, **kwargs)
File "/app/lightrag/llm/openai.py", line 641, in openai_embed
    response = await openai_async_client.embeddings.create(
File "/app/.venv/lib/python3.12/site-packages/openai/resources/embeddings.py", line 251, in create
    return await self._post(
File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1594, in request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'error': {'message': 'connection refused', 'type': 'api_error', 'param': None, 'code': None}}
Failed to extract document 3/15: contact_conversation_401118.json
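The 500 body ('connection refused') is generated by the server at EMBEDDING_BINDING_HOST rather than by the OpenAI client itself, which suggests the OpenAI-compatible frontend was reached but could not connect onward to its embedding backend; that would be consistent with the runner panics described above.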

Additional Information

LightRAG Docker Image: latest (v1.4.9.8)
Backend: Python inside LightRAG image v1.4.9.8
Neo4j Graph Storage: operational
Running inside Amazon ECS (Linux, AMD64)

akash171198 · Dec 11 '25 05:12

If the LightRAG Server is deployed in Docker, use host.docker.internal instead of localhost in EMBEDDING_BINDING_HOST:

EMBEDDING_BINDING_HOST=http://host.docker.internal:11434
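Note that on Linux hosts, host.docker.internal does not resolve by default; the container must be started with the host-gateway mapping, e.g. (image name is a placeholder):

docker run --add-host=host.docker.internal:host-gateway <lightrag-image>

The equivalent in docker-compose is an extra_hosts entry with the same host.docker.internal:host-gateway value.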

danielaskdd · Dec 12 '25 02:12