Can we get an ollama.stop?
ollama stop was a great addition. I was hoping stop could also be added to the API library.
Thanks
I second this. To be clear, I'm asking for the ollama Python library to expose the same behavior you get when you run ollama stop <target_model> from the terminal, which I believe is also what JTMarsh556 is asking for.
+1 I'm writing a script that I want to call with different models as a parameter, but I need the script to stop the model it's using at the end of its execution, so the way is clear if I call it again with a different model.
The generate and chat methods both accept a keep_alive parameter. When hot-switching models, I set keep_alive=0 on the call and the model unloads immediately afterwards (at least that's been my observation).
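For anyone looking for the same workaround, here's a minimal sketch using the ollama Python package. It assumes a locally running Ollama server; the model name llama3 is just a placeholder for whatever you actually have pulled.

```python
import ollama

# Run a one-off generation and ask the server to unload the model
# right after the request by setting keep_alive to 0.
response = ollama.generate(
    model="llama3",   # placeholder, replace with your model
    prompt="Say hello in one short sentence.",
    keep_alive=0,     # 0 tells the server not to keep the model loaded
)
print(response["response"])

# The same parameter works on chat():
chat_response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    keep_alive=0,
)
print(chat_response["message"]["content"])
```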
Great - thanks for bringing this up - will scope it in :)
Also, as I understand it, there can be multiple generations running at once through AsyncClient. So it would be better for client.chat, client.generate, etc. to return a handle with a stop function, along the lines of the sketch below.
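No such handle exists in the library today, but as a rough approximation you can already wrap each AsyncClient call in an asyncio task and cancel it. This is only a sketch: cancelling aborts the request from the client side, whether the server stops generating right away depends on the server, and the model stays loaded either way. The model name llama3 is a placeholder.

```python
import asyncio
from ollama import AsyncClient

async def main():
    client = AsyncClient()

    # Start two generations concurrently; each task acts as a crude "handle".
    slow = asyncio.create_task(client.generate(
        model="llama3", prompt="Write a long story about the sea."))
    fast = asyncio.create_task(client.generate(
        model="llama3", prompt="Say hello in one word."))

    print((await fast)["response"])

    # "Stop" the other generation by cancelling its task. This only aborts
    # the client-side request; it does not unload the model.
    slow.cancel()
    try:
        await slow
    except asyncio.CancelledError:
        print("slow generation cancelled")

asyncio.run(main())
```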
I'd like to see both: ollama force-quit, to immediately kill all running models no matter what, regardless of any work already done; and ollama drain, to allow whatever requests are in progress to finish while refusing any new ones, even on open connections.
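Neither command exists yet; the closest approximation of "force-quit everything" I know of is to list the running models and ask the server to unload each one with keep_alive=0. A hedged sketch, assuming ps() returns an object whose models entries expose the model name, and noting that this won't interrupt a request that is already streaming:

```python
import ollama

def force_unload_all():
    """Best-effort 'force-quit': ask the server to unload every loaded model."""
    running = ollama.ps()  # lists models currently loaded in memory
    for m in running["models"]:
        # An empty-prompt generate with keep_alive=0 asks the server to
        # unload this model as soon as the (trivial) request completes.
        ollama.generate(model=m["model"], prompt="", keep_alive=0)
        print(f"requested unload of {m['model']}")

if __name__ == "__main__":
    force_unload_all()
```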
Any updates on this? Or are there any alternatives for aborting a generation programmatically (killing a specific generation or model)?
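One workaround I've used in the meantime, as a sketch rather than an official API (the stop condition and model name are purely illustrative): stream the generation so you can stop consuming it whenever you like, then send an empty keep_alive=0 request to unload the model.

```python
import ollama

model = "llama3"  # placeholder model name

# Stream the generation so we can abort it from our side mid-way.
stream = ollama.generate(model=model,
                         prompt="Count slowly to one thousand.",
                         stream=True)
collected = ""
for chunk in stream:
    collected += chunk["response"]
    if len(collected) > 200:  # illustrative stop condition
        # Stop reading further chunks; the underlying connection should be
        # released once the iterator is dropped.
        break

# Then ask the server to unload the model entirely.
ollama.generate(model=model, prompt="", keep_alive=0)
```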
Was this function implemented? I don't see it in the docs.