Improvement Request: Improve error information while calling Embedder
Context: I am currently using an Ollama gemma:7b model, which seems to work properly for simple tasks, but, as you mention in https://help.getzep.com/graphiti/graphiti/installation, some models may not return answers in the shape that is needed. This may be the problem I am hitting, but I am not sure, because the error log is not precise.
My code gets the text from a PDF file and adds it as an episode:
# Create episode in Graphiti with document content and metadata
await graphiti.add_episode(
    name=file_path.name,
    episode_body=text,
    reference_time=datetime.now(timezone.utc),
    source=EpisodeType.text,
    source_description=f"Local document: {file_path.name}",
    group_id="local",
    entity_types=resolution_entity_types,
)
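For context, here is a minimal sketch of the preceding extraction step (this part is not in my snippet above; the use of pypdf and the helper name are just an illustration):

from pathlib import Path
from pypdf import PdfReader

def extract_text(file_path: Path) -> str:
    # Concatenate the text of every page into a single episode body.
    reader = PdfReader(str(file_path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)

text = extract_text(Path("data/test-file.pdf"))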
This is my Graphiti config:
return {
    "llm_client": OpenAIGenericClient(
        config=LLMConfig(
            base_url="http://xxx.xxx.xxx.xxx:11434/v1",
            model="gemma:7b",
            small_model="gemma:7b",
        )
    ),
    "embedder": OpenAIEmbedder(
        config=OpenAIEmbedderConfig(
            embedding_model="text-embedding-3-small"
        ),
        client=openai_client,
    ),
}
The problem is that the /chat/completions call seems to be working (though probably not answering with the correct data), but the /embeddings call is returning a 400 Bad Request.
Here are the logs; I've redacted the IP of the server where my Ollama LLM is deployed.
➜ archivosya-graphiti git:(main) ✗ uv run src/local.py
2025-05-13 15:18:54 - __main__ - INFO - Found 1 files to process
2025-05-13 15:18:54 - __main__ - INFO - Processing /Users/noel/ArchivosYa/archivosya-graphiti/data/test-file.pdf
2025-05-13 15:27:08 - httpx - INFO - HTTP Request: POST http://xx.xxx.xxx.xxx:11434/v1/chat/completions "HTTP/1.1 200 OK"
2025-05-13 15:27:09 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 400 Bad Request"
2025-05-13 15:27:09 - __main__ - INFO - Neo4j connection closed
Traceback (most recent call last):
File "/Users/noel/ArchivosYa/archivosya-graphiti/src/local.py", line 127, in <module>
asyncio.run(process_local_files())
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/Users/noel/ArchivosYa/archivosya-graphiti/src/local.py", line 109, in process_local_files
await graphiti.add_episode(
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/graphiti_core/graphiti.py", line 416, in add_episode
raise e
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/graphiti_core/graphiti.py", line 361, in add_episode
extracted_nodes = await extract_nodes(
^^^^^^^^^^^^^^^^^^^^
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/graphiti_core/utils/maintenance/node_operations.py", line 168, in extract_nodes
await create_entity_node_embeddings(embedder, extracted_nodes)
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/graphiti_core/nodes.py", line 566, in create_entity_node_embeddings
name_embeddings = await embedder.create_batch([node.name for node in nodes])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/graphiti_core/embedder/openai.py", line 63, in create_batch
result = await self.client.embeddings.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/openai/resources/embeddings.py", line 128, in create
return self._post(
^^^^^^^^^^^
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/openai/_base_client.py", line 1239, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/noel/ArchivosYa/archivosya-graphiti/.venv/lib/python3.11/site-packages/openai/_base_client.py", line 1034, in request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'$.input' is invalid. Please check the API reference: https://platform.openai.com/docs/api-reference.", 'type': 'invalid_request_error', 'param': None, 'code': None}}
I wonder if you could add (or improve) some structure detection on the /chat/completions response, so that an error is thrown explaining what is happening, like: "The response from the LLM is not correct. Expected X, got Y." Then I could focus on fixing or changing the LLM model.
But right now, I don't know what the issue could be or how to proceed.
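For illustration, here is a minimal sketch of the kind of check I have in mind. This is not existing Graphiti code; the function name and where it would be called are hypothetical:

def validate_embedding_inputs(texts: list[str]) -> None:
    # Fail fast with a descriptive message instead of a bare 400 from the embeddings API.
    for i, text in enumerate(texts):
        if not isinstance(text, str) or not text.strip():
            raise ValueError(
                f"Embedding input {i} is invalid: expected a non-empty string, got {text!r}. "
                "The LLM response likely did not have the expected structure."
            )

# e.g. called on [node.name for node in nodes] before embedder.create_batch(...)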
I suspect that the node names being generated by the Ollama model are not being accepted by the OpenAI API. These should be non-empty strings, but are likely null or empty. I think you're going to struggle to get Graphiti to work with smaller, local models.
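You could test this hypothesis directly with a quick script (a sketch, assuming a standard OPENAI_API_KEY in the environment): send a batch containing an empty string straight to /embeddings and see whether it reproduces the same 400.

import asyncio
from openai import AsyncOpenAI

async def main() -> None:
    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
    try:
        await client.embeddings.create(
            model="text-embedding-3-small",
            input=["Acme Corp", ""],  # one plausible entity name, one empty name
        )
    except Exception as exc:
        # If the hypothesis holds, this prints a 400 with "'$.input' is invalid"
        print(exc)

asyncio.run(main())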
Well... have you gotten it working with any local model? Maybe something outside Ollama?
If there is no way to get Graphiti working with free models, that's a big limitation.
In my case I have to process around 2 million PDF files with 130 pages each. That would cost an astronomical amount of money on paid models.
@danielchalef is there any update on this?
Could you please point me to a free local Ollama model that could work, so I can test it out?
I'm considering creating a DigitalOcean server with a free local Ollama model and using it with Graphiti, but all the ones I've tried are failing.
If you know some model that works, please let me know.
@BrodaNoel, I just created a pull request to add documentation https://github.com/getzep/graphiti/pull/601. Using deepseek-r1:7b with Ollama worked for me as it was roughly the same size as 4.1-mini and gave reliable structured output.
@thorchh AWESOME! Do you know how much RAM and what GPU are recommended for that model? I'll run it in a DigitalOcean virtual machine.
I'll try it soon! Thanks!
It worked pretty well for me on an old 16 GB M1 MacBook (not instant, but reasonable for a local model).
This source says that for small models (7B), the minimum RAM is 8 GB and the recommended RAM is 16 GB: https://www.byteplus.com/en/topic/405436?title=how-much-memory-does-ollama-need
@BrodaNoel Is this still relevant? Please confirm within 14 days or this issue will be closed.
I tested it using the Ollama endpoint with the deepseek-r1:7b model and the LM Studio endpoint with the text-embedding-nomic-embed-text-v1.5 model, and both worked successfully for me:
llm_config = LLMConfig(
    api_key="ollama",
    model="deepseek-r1:7b",
    small_model="deepseek-r1:7b",
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
)

embedder = OpenAIEmbedder(
    config=OpenAIEmbedderConfig(
        api_key="lmstudio",
        embedding_model="text-embedding-nomic-embed-text-v1.5",
        embedding_dim=768,
        base_url="http://localhost:1234/v1",  # LM Studio's OpenAI-compatible endpoint
    )
)
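For completeness, a sketch of how these two pieces can be wired into a Graphiti instance. The Neo4j URI and credentials below are placeholders, and the import paths are what I believe the package currently exposes:

from graphiti_core import Graphiti
from graphiti_core.llm_client import LLMConfig
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient
from graphiti_core.embedder.openai import OpenAIEmbedder, OpenAIEmbedderConfig

graphiti = Graphiti(
    "bolt://localhost:7687",  # placeholder Neo4j connection details
    "neo4j",
    "password",
    llm_client=OpenAIGenericClient(config=llm_config),
    embedder=embedder,
)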
@BrodaNoel Is this still an issue? Please confirm within 14 days or this issue will be closed.