
Docs: Missing documentation about how to implement `OpenAIGenericClient`

Open BrodaNoel opened this issue 8 months ago • 7 comments

Following this discussion: https://github.com/getzep/graphiti/issues/480, and reading this documentation: https://help.getzep.com/graphiti/graphiti/installation

I'm trying to use Ollama's gemma:7b model in order to replace the expensive OpenAI models.

The point is, there is no documentation or example showing how to implement it.

I currently have this code to implement OpenAI:

import os

from openai import OpenAI

from graphiti_core.embedder.openai import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.llm_client.openai_client import OpenAIClient

def get_graphiti_llm_config_openai():
    """Get the OpenAI LLM and embedder configuration for Graphiti."""
    openai_client = OpenAI(
        api_key=os.getenv("OPENAI_API_KEY")
    )
    
    return {
        "llm_client": OpenAIClient(
            client=openai_client,
            model="gpt-4.1-nano"
        ),
        "embedder": OpenAIEmbedder(
            config=OpenAIEmbedderConfig(
                embedding_model="text-embedding-3-small"
            ),
            client=openai_client
        )
    } 

And I have a server running Ollama gemma:7b at http://xxx.xxx.xxx.xxx:11434

Do you have any idea how to get this working?

I'll add it to the documentation if you provide me an answer and I get it working.

BrodaNoel avatar May 13 '25 15:05 BrodaNoel

I think I got it working with this code:

import os
from openai import OpenAI
from graphiti_core import Graphiti
from graphiti_core.embedder.openai import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.llm_client import LLMConfig
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient

def test():
    openai_client = OpenAI(
        api_key=os.getenv("OPENAI_API_KEY")
    )

    # NEO4J_URI, NEO4J_USER and NEO4J_PASSWORD are assumed to be defined
    # elsewhere (e.g. read from environment variables)
    graphiti = Graphiti(
        NEO4J_URI, 
        NEO4J_USER, 
        NEO4J_PASSWORD,
        llm_client=OpenAIGenericClient(
            config=LLMConfig(
                base_url="http://xxx.xxx.xxx.xxx:11434/v1",
                model="gemma:7b",
                small_model="gemma:7b",
            )
        ),
        embedder=OpenAIEmbedder(
            config=OpenAIEmbedderConfig(
                embedding_model="text-embedding-3-small"
            ),
            client=openai_client
        )
    )

    return graphiti

If you agree, I can add it to the documentation, at the bottom of https://help.getzep.com/graphiti/graphiti/installation

BrodaNoel avatar May 13 '25 16:05 BrodaNoel

I can also provide an example of how to run a local model:

# install ollama
curl -fsSL https://ollama.com/install.sh | sh

# restart shell
exec $SHELL

# run ollama gemma:7b model
ollama run gemma:7b

BrodaNoel avatar May 13 '25 16:05 BrodaNoel

Regarding openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY")): it is OK to replace this with an OpenAI-compatible LLM.

But how do you replace embedder=OpenAIEmbedder(config=OpenAIEmbedderConfig(embedding_model="text-embedding-3-small"), ...) with either Ollama local models or vLLM local models?

look4pritam avatar May 21 '25 16:05 look4pritam

I already implemented it, but the implementation is not working, as you can read in https://github.com/getzep/graphiti/issues/485

So, before confirming this is the proper way, I would like to see it working

BrodaNoel avatar May 21 '25 16:05 BrodaNoel

The above is correct, with the caveat I mentioned in #485. We'd appreciate a PR adding the above to the README.

danielchalef avatar May 22 '25 04:05 danielchalef

I will try with vLLM and a Qwen-series model. I have done this successfully for other projects, and the API looks compatible with vLLM on the LLM side. But adding a local embedding model will require some effort: I cannot configure it with API-based proprietary models, so I need to run it locally. If I am able to get it done, I will raise a PR. It will take some time.

look4pritam avatar May 22 '25 16:05 look4pritam

I am not that interested in getting local embeddings working, because embeddings are quite cheap in general.

The problem is the LLM: that is the expensive part, and it is the one that really needs to be replaced with a local version.

Could you get a local model working with Graphiti? If so, let me know which model, so I can try. Then I can also add a PR.

BrodaNoel avatar May 22 '25 17:05 BrodaNoel

Has anybody found a free model (Ollama?) that works with this?

BrodaNoel avatar Jun 13 '25 22:06 BrodaNoel

@BrodaNoel, I just created a pull request to add documentation https://github.com/getzep/graphiti/pull/601. Using deepseek-r1:7b with Ollama worked for me as it was roughly the same size as 4.1-mini and gave reliable structured output.

thorchh avatar Jun 18 '25 07:06 thorchh

For fully local models, I found that using bge-m3 as the embedding model (in a Chinese-language context) with command-r7b as the LLM may perform better than deepseek-r1:7b; sometimes deepseek-r1:7b will get stuck in a loop if episode_body is large, for example when adding a summary of a project description to memory.

brianchul avatar Jul 12 '25 05:07 brianchul

@BrodaNoel Is this still relevant? Please confirm within 14 days or this issue will be closed.

claude[bot] avatar Oct 05 '25 00:10 claude[bot]
