Ollama keep_alive is not working
Hi,
When using Ollama and passing "keep_alive" in "language_model_params", the model is still loaded with the default keep_alive of 5 minutes.
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OllamaLanguageModel,
    model_id="qwen2.5:14b",
    model_url=os.getenv("OLLAMA_HOST", "http://localhost:11434"),
    temperature=0.3,
    fence_output=False,
    use_schema_constraints=False,
    max_char_buffer=5000,
    language_model_params={
        "num_ctx": 8192,
        "keep_alive": 10*60,  # 10 minutes
        "timeout": 10*60,  # 10 minutes
    },
)
To verify, run the following after the call (assuming the model wasn't already in memory). It shows the model loaded for 5 minutes instead of the requested 10.
ollama ps
In the Ollama.py file, it looks like keep_alive is placed under the "options" parameter, but the Ollama API documentation lists it as a top-level parameter, so the payload should be:
payload: dict[str, Any] = {
    'model': model,
    'prompt': prompt,
    'system': system,
    'stream': False,
    'raw': raw,
    'keep_alive': keep_alive,
    'options': options,
}
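For illustration, here is a minimal sketch of how the user-supplied params could be split so that keep_alive ends up at the top level and everything else stays under "options". The helper name `build_generate_payload` and the `TOP_LEVEL_KEYS` tuple are hypothetical and are not taken from LangExtract's code; the only thing assumed about Ollama is what is described above, namely that keep_alive is a top-level request field.

```python
from typing import Any

# Keys that belong at the top level of the request body rather than
# under "options". Only keep_alive is handled here (assumption: other
# params in this example are genuine model options).
TOP_LEVEL_KEYS = ("keep_alive",)


def build_generate_payload(
    model: str, prompt: str, params: dict[str, Any]
) -> dict[str, Any]:
    """Hypothetical helper: split params into top-level fields and options."""
    options = dict(params)  # copy so the caller's dict is not mutated
    payload: dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "stream": False,
    }
    # Move each top-level key out of options and into the payload itself.
    for key in TOP_LEVEL_KEYS:
        if key in options:
            payload[key] = options.pop(key)
    payload["options"] = options
    return payload


payload = build_generate_payload(
    "qwen2.5:14b",
    "Hello",
    {"num_ctx": 8192, "keep_alive": 10 * 60},
)
```

With this split, `payload["keep_alive"]` holds the requested duration in seconds and `payload["options"]` contains only `num_ctx`.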
Did you figure this out?
There's a PR to fix this. In the meantime, if you need it resolved now, you can edit the Ollama.py file for LangExtract in your pip installation folder, using the PR as a guide for moving keep_alive to the top-level params.