Maximum context length exception when running run_code_graph_pipeline
When running the example in examples/python/code_graph_example.py, I get the following error:
instructor.exceptions.InstructorRetryException: litellm.ContextWindowExceededError: litellm.BadRequestError: ContextWindowExceededError: OpenAIException - This model's maximum context length is 128000 tokens. However, your messages resulted in 147379 tokens (147214 in the messages, 165 in the functions). Please reduce the length of the messages or functions.
Somewhere in the pipeline cognee is passing very large chunks to the LLM, and I cannot find how to adjust the maximum chunk size. The suggestion to "Please reduce the length of the messages or functions." is not helpful when I am trying to build a code assistant on top of somebody else's repo. In this example I am actually trying to cognify the cognee repo itself:
python code_graph_example.py --repo_path ~/cognee
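As a rough standalone check (this uses tiktoken directly and is not part of cognee's pipeline; the file path is just a placeholder), counting tokens per file can help narrow down whether single files or an assembled prompt are blowing past the 128k window:

```python
# Standalone sanity check (not cognee API): count tokens in a source file
# with tiktoken to see whether it alone approaches the context limit.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-class models

def count_tokens(path: str) -> int:
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        return len(encoding.encode(f.read()))

print(count_tokens("cognee/some_large_module.py"))  # hypothetical path
```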
There is another example in the documentation on code cognification, but it is outdated: notebooks/cognee_code_graph_demo.ipynb
There currently does not appear to be a working example of cognifying a large repo. I have even tried MCP, but the "codify" method just returns JSON parsing errors over and over.
Can you try running the embedding with these options in the .env file?
EMBEDDING_PROVIDER="fastembed"
EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
EMBEDDING_DIMENSIONS=384
EMBEDDING_MAX_TOKENS=256
This is a multiprocess local embedding engine that should resolve your issues with chunking in the code graph.
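If you want to verify the model independently first, a minimal check that fastembed loads it and produces 384-dimensional vectors looks roughly like this (this exercises the fastembed library directly; cognee picks the model up from the .env settings above):

```python
# Standalone check of the fastembed model referenced above; cognee itself
# reads these settings from .env, this only verifies the model works locally.
from fastembed import TextEmbedding

model = TextEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectors = list(model.embed(["def run_code_graph_pipeline(repo_path): ..."]))
print(len(vectors[0]))  # expect 384 dimensions
```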
Should be resolved with embedding splitting and smaller batch-size handling. Out in the next release.
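For anyone hitting this before the release, the general idea is roughly the following (an illustrative sketch, not cognee's actual implementation): texts longer than EMBEDDING_MAX_TOKENS are split into sub-chunks, and embedding requests are sent in smaller batches.

```python
# Illustrative sketch only, not cognee's actual code: split over-long texts
# into sub-chunks before embedding and process them in small batches.
from typing import Iterable, List

MAX_TOKENS = 256   # mirrors EMBEDDING_MAX_TOKENS above
BATCH_SIZE = 32    # hypothetical batch size

def split_text(text: str, max_tokens: int = MAX_TOKENS) -> List[str]:
    # Naive whitespace split; a real implementation counts model tokens instead.
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def batched(items: List[str], size: int = BATCH_SIZE) -> Iterable[List[str]]:
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_all(texts: List[str], embed_fn) -> List[list]:
    chunks = [piece for text in texts for piece in split_text(text)]
    vectors: List[list] = []
    for batch in batched(chunks):
        vectors.extend(embed_fn(batch))  # embed_fn is whatever engine is configured
    return vectors
```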