Add option to ignore 'keep_alive' in request body

Open sjoerdvandenbos-prodrive opened this issue 9 months ago • 0 comments

When hosting Ollama in production, we would like to prevent our users from unloading the model from the GPU. Currently, whenever users talk to the model through the Continue extension, their requests reset keep_alive to 5 minutes.

It would be nice to have an environment variable such as IGNORE_KEEP_ALIVE_REQUESTS=1, so that we can set OLLAMA_KEEP_ALIVE=-1 in the container and have it apply regardless of what clients send.
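
To make the requested precedence concrete, here is a rough sketch of the behaviour I have in mind. It is not Ollama's actual code: `effective_keep_alive` is a hypothetical helper, `IGNORE_KEEP_ALIVE_REQUESTS` is the variable proposed in this issue and does not exist today, and the `"5m"` fallback is assumed from the 5-minute reset described above.

```python
import os
from typing import Mapping, Optional


def effective_keep_alive(
    request_body: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> object:
    """Return the keep_alive value the server would honor (hypothetical logic)."""
    env = os.environ if env is None else env

    # Server-wide setting, e.g. "-1" to keep the model loaded indefinitely.
    # The "5m" fallback is an assumption for this sketch.
    server_default = env.get("OLLAMA_KEEP_ALIVE", "5m")

    # Proposed flag from this issue; not an existing Ollama setting.
    ignore_requests = env.get("IGNORE_KEEP_ALIVE_REQUESTS", "0") == "1"

    requested = request_body.get("keep_alive")
    if requested is None or ignore_requests:
        # Either the client sent nothing, or the operator chose to ignore it.
        return server_default
    return requested


if __name__ == "__main__":
    container_env = {"OLLAMA_KEEP_ALIVE": "-1", "IGNORE_KEEP_ALIVE_REQUESTS": "1"}
    client_request = {"model": "some-model", "keep_alive": "5m"}
    print(effective_keep_alive(client_request, container_env))
```

With the flag set, the client's `keep_alive` is dropped and the container's value wins; without the flag, the per-request value still takes precedence, so existing deployments would be unaffected.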