
Support setting context window size for Ollama models (num_ctx)

Open swys opened this issue 9 months ago • 2 comments

Is your feature request related to a problem? Please describe. When uploading a document, I see the following warning in the Ollama logs:

[GIN] 2025/02/16 - 16:26:54 | 200 |     10.7681ms |      10.20.1.73 | POST     "/api/show"
time=2025-02-16T16:27:30.444-05:00 level=WARN source=runner.go:129 msg="truncating input prompt" limit=2048 prompt=2363 keep=5 new=2048
[GIN] 2025/02/16 - 16:27:34 | 200 |    4.2324405s |      10.20.1.73 | POST     "/api/generate"
[GIN] 2025/02/16 - 16:27:34 | 200 |     10.9234ms |      10.20.1.73 | POST     "/api/show"
[GIN] 2025/02/16 - 16:27:34 | 200 |     11.0538ms |      10.20.1.73 | POST     "/api/show"

The prompt size (2363 tokens) exceeds the default context window of 2048, so the input is truncated. I would like the ability to set the context window size to prevent this truncation.

I was not able to find an existing config option that controls this setting. Please let me know if I missed something in the docs.

While searching through issues in this repo I came across this PR: https://github.com/SciPhi-AI/R2R/pull/1033, which mentions the feature I am after; however, it appears to have been closed without being merged.

Describe the solution you'd like It would be nice to have an additional config key such as num_ctx (the same option Ollama expects when you create a client). That way users can adjust the context window size as needed.
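To illustrate the idea, here is a hypothetical r2r.toml fragment. The num_ctx key does not exist in R2R today, and the section and key names below are only my guess at where such a setting might live, not R2R's actual schema:

```toml
# Hypothetical fragment -- num_ctx is the proposed key, section names are illustrative
[completion.generation_config]
model = "ollama/llama3.1"
temperature = 0.7

[completion.generation_config.extra]
num_ctx = 8192  # forwarded to Ollama to override the 2048-token default
```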

When I create a client for Ollama (via LangChain's ChatOllama), I use a snippet like the one below:

from langchain_ollama import ChatOllama

llm = ChatOllama(
  model=ollama_model,            # e.g. "llama3.1"
  temperature=ollama_temp,
  base_url=ollama_base_url,      # e.g. "http://localhost:11434"
  num_ctx=ollama_context_size    # context window size, e.g. 8192
)

to set the context size via the num_ctx key. This works as expected, and I am able to increase the context window beyond the default 2048.
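For reference, Ollama also accepts num_ctx per request through the options field of its REST API (POST /api/generate), so a provider layer that builds its own requests could pass the value through without a custom Modelfile. A minimal sketch, where the helper name and the example model/prompt/size values are mine, not from R2R:

```python
import json

def build_generate_payload(model: str, prompt: str, num_ctx: int) -> dict:
    """Build the JSON body for Ollama's POST /api/generate with a custom context window."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # "options" accepts Modelfile-style runtime parameters per request;
        # num_ctx here overrides the 2048-token default that causes the truncation.
        "options": {"num_ctx": num_ctx},
    }

payload = build_generate_payload("llama3.1", "Summarize this document...", 8192)
print(json.dumps(payload, indent=2))
```

A config key like the one proposed above could simply be threaded into that options dict.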

Describe alternatives you've considered N/A

Additional context N/A

swys — Feb 16 '25 21:02