
[Bug]: Embedding fails with openai.InternalServerError + Ollama v0.13.x breaks embedding integration (works in v0.12.11)

Open · akash171198 opened this issue 2 weeks ago · 1 comment

Do you need to file an issue?

  • [x] I have searched the existing issues and this bug is not already filed.
  • [x] I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

LightRAG fails during document processing when using an OpenAI-compatible API for the LLM and an Ollama server for embeddings.

Two separate issues occur:

1. Embedding failure with the OpenAI-compatible provider

Even though my .env is configured with:

EMBEDDING_BINDING=openai
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_BINDING_HOST=http://<hostedOllama>:11434/v1/

LightRAG throws:

openai.InternalServerError: Error code: 500 {'error': {'message': 'connection refused', 'type': 'api_error'}}

even though the endpoint is reachable and the same request succeeds via curl/Postman.
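One way to isolate this is to call the same endpoint with the same OpenAI client from inside the LightRAG container. A minimal sketch, assuming the host and model values from the config above (<hostedOllama> is a placeholder; the api_key value is arbitrary because Ollama ignores it):

```python
# Call the OpenAI-compatible embeddings endpoint directly, bypassing LightRAG,
# to separate network/endpoint problems from LightRAG integration problems.
from openai import OpenAI

client = OpenAI(
    base_url="http://<hostedOllama>:11434/v1/",  # EMBEDDING_BINDING_HOST from .env
    api_key="ollama",  # any non-empty string; the client just requires one
)
resp = client.embeddings.create(
    model="nomic-embed-text:latest",
    input=["connectivity check"],
)
print(len(resp.data[0].embedding))  # expect 768 for nomic-embed-text
```

If this fails from inside the container but works from the host, the problem is container networking rather than LightRAG itself.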

2. Ollama v0.13.x embedding regression

When embedding requests hit Ollama v0.13.x, LightRAG receives:

do embedding request: Post "http://127.0.0.1:/embedding": EOF

and Ollama v0.13.x logs repeated crashes:

panic: caching disabled but unable to fit entire input in a batch
runner.go:707 starting runner ...
starting runner ...

The SAME configuration works perfectly with Ollama v0.12.11 but breaks on v0.13.0–v0.13.2.
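The panic message suggests a single embedding request carries more input than the v0.13.x runner can fit in one batch. A hedged reproduction sketch against Ollama's native embeddings endpoint, with LightRAG out of the loop (host and input length are assumptions):

```python
# Send a deliberately long input straight to Ollama's native /api/embed endpoint
# to check whether the v0.13.x runner panic reproduces without LightRAG involved.
import requests

long_text = "lorem ipsum dolor sit amet " * 1000  # long enough to stress the batch limit
r = requests.post(
    "http://<hostedOllama>:11434/api/embed",
    json={"model": "nomic-embed-text:latest", "input": long_text},
    timeout=120,
)
print(r.status_code)
print(r.text[:300])  # if the regression reproduces, expect an EOF/500 on v0.13.x
```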

Steps to reproduce

1. Deploy LightRAG (latest Docker image) with this .env:

EMBEDDING_BINDING=openai
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_BINDING_HOST=http://<hostedOllama>:11434/v1/
OLLAMA_EMBEDDING_NUM_CTX=8192

LLM_BINDING=openai
LLM_MODEL=cogito-2.1:671b-cloud
LLM_BINDING_HOST=https://<cloudLLMHost>/v1/
LLM_BINDING_API_KEY=sk_live_xxxxx

2. Run Ollama with the embedding model loaded (a quick liveness check is sketched after these steps):

ollama pull nomic-embed-text
ollama serve

3. Upload any document (JSON/PDF) through the LightRAG UI.

4. Processing fails during the entity-embedding step with openai.InternalServerError.

If Ollama 0.13.x is used → the embedding runner crashes continuously. If Ollama 0.12.11 is used → everything works.
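The liveness check referenced in step 2 can be as simple as asking the server which models it has available. A minimal sketch (host is a placeholder):

```python
# Confirm the Ollama server responds and the embedding model is actually present.
import requests

tags = requests.get("http://<hostedOllama>:11434/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])  # should include nomic-embed-text:latest
```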

Expected Behavior

LightRAG should successfully call the embedding API through the OpenAI-compatible wrapper:

  • Embeddings should process without InternalServerError
  • Ollama embedding runner should not crash
  • Upgrading Ollama should not break LightRAG
  • Document ingestion should complete normally
  • Behavior should match v0.12.11

LightRAG Config Used

HOST=0.0.0.0
PORT=9621
WEBUI_TITLE='G99 Rag7'
WEBUI_DESCRIPTION="Simple and Fast Graph Based RAG System"
TIMEOUT=640
WORKER_TIMEOUT=640

AUTH_ACCOUNTS='admin:Admin@123,akash:Akash@123'
TOKEN_SECRET=dummy_token_secret_123

ENABLE_LLM_CACHE=true

RERANK_BINDING=null

ENABLE_LLM_CACHE_FOR_EXTRACT=true
SUMMARY_LANGUAGE=English

MAX_ASYNC=4
MAX_PARALLEL_INSERT=4
EMBEDDING_BATCH_NUM=10

LLM_TIMEOUT=640
LLM_BINDING=openai
LLM_MODEL=deepseek-v3.1:671b-cloud
LLM_BINDING_HOST=http://:11434/v1/
LLM_BINDING_API_KEY=sk_dummy_llm_key_123

OLLAMA_LLM_NUM_CTX=32768

EMBEDDING_TIMEOUT=640
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=nomic-embed-text:latest
EMBEDDING_DIM=768
EMBEDDING_BINDING_API_KEY=sk_k***
EMBEDDING_BINDING_HOST=http://:11434/v1/
OLLAMA_EMBEDDING_NUM_CTX=8192

WORKSPACE=business7

HTTPX_TIMEOUT=640

LIGHTRAG_GRAPH_STORAGE=Neo4JStorage

LIGHTRAG_KV_STORAGE=PGKVStorage
LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
LIGHTRAG_VECTOR_STORAGE=PGVectorStorage

POSTGRES_HOST=192.168.31.228
POSTGRES_PORT=5432
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DATABASE=postgres
POSTGRES_MAX_CONNECTIONS=12
POSTGRES_WORKSPACE=business7
POSTGRES_VECTOR_INDEX_TYPE=HNSW
POSTGRES_HNSW_M=16
POSTGRES_HNSW_EF=200
POSTGRES_IVFFLAT_LISTS=100
POSTGRES_CONNECTION_RETRIES=3
POSTGRES_CONNECTION_RETRY_BACKOFF=0.5
POSTGRES_CONNECTION_RETRY_BACKOFF_MAX=5.0
POSTGRES_POOL_CLOSE_TIMEOUT=5.0

NEO4J_URI=bolt://192.168.31.228:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=neo_pass_123
NEO4J_DATABASE=neo4j
NEO4J_MAX_CONNECTION_POOL_SIZE=100
NEO4J_CONNECTION_TIMEOUT=30
NEO4J_CONNECTION_ACQUISITION_TIMEOUT=30
NEO4J_MAX_TRANSACTION_RETRY_TIME=30
NEO4J_MAX_CONNECTION_LIFETIME=300
NEO4J_LIVENESS_CHECK_TIMEOUT=30
NEO4J_KEEP_ALIVE=true
NEO4J_WORKSPACE=business7

MONGO_URI=mongodb://root:root@localhost:27017/
MONGO_DATABASE=LightRAG

MILVUS_URI=http://localhost:19530
MILVUS_DB_NAME=lightrag

QDRANT_URL=http://localhost:6333

REDIS_URI=redis://localhost:6379
REDIS_SOCKET_TIMEOUT=30
REDIS_CONNECT_TIMEOUT=10
REDIS_MAX_CONNECTIONS=100
REDIS_RETRY_ATTEMPTS=3

MEMGRAPH_URI=bolt://localhost:7687
MEMGRAPH_DATABASE=memgraph

Logs and screenshots

LightRAG logs:

File "/app/lightrag/utils.py", line 358, in __call__
    return await self.func(*args, **kwargs)
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
    return await copy(fn, *args, **kwargs)
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
    do = await self.iter(retry_state=retry_state)
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
File "/app/.venv/lib/python3.12/site-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
File "/app/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 400, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
File "/app/.venv/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 114, in __call__
    result = await fn(*args, **kwargs)
File "/app/lightrag/llm/openai.py", line 641, in openai_embed
    response = await openai_async_client.embeddings.create(
File "/app/.venv/lib/python3.12/site-packages/openai/resources/embeddings.py", line 251, in create
    return await self._post(
File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File "/app/.venv/lib/python3.12/site-packages/openai/_base_client.py", line 1594, in request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'error': {'message': 'connection refused', 'type': 'api_error', 'param': None, 'code': None}}
Failed to extract document 3/15: contact_conversation_401118.json
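The 500 body ('connection refused') is generated by the server at EMBEDDING_BINDING_HOST rather than by the OpenAI client itself, which suggests the OpenAI-compatible frontend was reached but could not connect onward to its embedding backend; that would be consistent with the runner panics described above.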

Additional Information

LightRAG Docker Image: latest (v1.4.9.8)
Backend: Python inside LightRAG image v1.4.9.8
Neo4j Graph Storage: operational
Running inside Amazon ECS (Linux, AMD64)

akash171198 · Dec 11 '25 05:12

If the LightRAG Server is deployed in Docker, use host.docker.internal instead of localhost in EMBEDDING_BINDING_HOST:

EMBEDDING_BINDING_HOST=http://host.docker.internal:11434
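Note that on Linux hosts, host.docker.internal does not resolve by default; the container must be started with the host-gateway mapping, e.g. (image name is a placeholder):

docker run --add-host=host.docker.internal:host-gateway <lightrag-image>

The equivalent in docker-compose is an extra_hosts entry with the same host.docker.internal:host-gateway value.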

danielaskdd · Dec 12 '25 02:12