Can we get an ollama.stop?
ollama stop was a great addition. I was hoping stop could also be added to the API library.
Thanks
I second this. To be clear, I'm asking for the ollama Python library to expose the same behavior you get when you run ollama stop <target_model> from the terminal, which I believe is also what JTMarsh556 is asking for.
+1 I'm writing a script that I want to call with different models as a parameter, but I need the script to stop the model it's using at the end of its execution, so the way is clear if I call it again with a different model.
The generate and chat methods both accept a keep_alive parameter. When hot-switching models, I set keep_alive=0 on the call and the model unloads immediately afterwards (at least that's been my observation).
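For anyone looking for the same workaround, here's a minimal sketch using the ollama Python package. It assumes a locally running Ollama server; the model name llama3 is just a placeholder for whatever you actually have pulled.

```python
import ollama

# Run a one-off generation and ask the server to unload the model
# right after the request by setting keep_alive to 0.
response = ollama.generate(
    model="llama3",   # placeholder, replace with your model
    prompt="Say hello in one short sentence.",
    keep_alive=0,     # 0 tells the server not to keep the model loaded
)
print(response["response"])

# The same parameter works on chat():
chat_response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    keep_alive=0,
)
print(chat_response["message"]["content"])
```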
Great - thanks for bringing this up - will scope it in :)
Also, as I understand it, there can be multiple generations running at once through AsyncClient. So it would be better for client.chat, client.generate, etc. to return a handle with a stop function, along the lines of the sketch below.
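No such handle exists in the library today, but as a rough approximation you can already wrap each AsyncClient call in an asyncio task and cancel it. This is only a sketch: cancelling aborts the request from the client side, whether the server stops generating right away depends on the server, and the model stays loaded either way. The model name llama3 is a placeholder.

```python
import asyncio
from ollama import AsyncClient

async def main():
    client = AsyncClient()

    # Start two generations concurrently; each task acts as a crude "handle".
    slow = asyncio.create_task(client.generate(
        model="llama3", prompt="Write a long story about the sea."))
    fast = asyncio.create_task(client.generate(
        model="llama3", prompt="Say hello in one word."))

    print((await fast)["response"])

    # "Stop" the other generation by cancelling its task. This only aborts
    # the client-side request; it does not unload the model.
    slow.cancel()
    try:
        await slow
    except asyncio.CancelledError:
        print("slow generation cancelled")

asyncio.run(main())
```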
I'd like to see both: ollama force-quit, to immediately kill all running models no matter what, regardless of any work already done; and ollama drain, to allow whatever requests are in progress to finish while refusing any new ones, even on open connections.
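Neither command exists yet; the closest approximation of "force-quit everything" I know of is to list the running models and ask the server to unload each one with keep_alive=0. A hedged sketch, assuming ps() returns an object whose models entries expose the model name, and noting that this won't interrupt a request that is already streaming:

```python
import ollama

def force_unload_all():
    """Best-effort 'force-quit': ask the server to unload every loaded model."""
    running = ollama.ps()  # lists models currently loaded in memory
    for m in running["models"]:
        # An empty-prompt generate with keep_alive=0 asks the server to
        # unload this model as soon as the (trivial) request completes.
        ollama.generate(model=m["model"], prompt="", keep_alive=0)
        print(f"requested unload of {m['model']}")

if __name__ == "__main__":
    force_unload_all()
```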
Any updates on this? Or are there any alternatives for aborting a generation programmatically (killing a specific generation or model)?
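One workaround I've used in the meantime, as a sketch rather than an official API (the stop condition and model name are purely illustrative): stream the generation so you can stop consuming it whenever you like, then send an empty keep_alive=0 request to unload the model.

```python
import ollama

model = "llama3"  # placeholder model name

# Stream the generation so we can abort it from our side mid-way.
stream = ollama.generate(model=model,
                         prompt="Count slowly to one thousand.",
                         stream=True)
collected = ""
for chunk in stream:
    collected += chunk["response"]
    if len(collected) > 200:  # illustrative stop condition
        # Stop reading further chunks; the underlying connection should be
        # released once the iterator is dropped.
        break

# Then ask the server to unload the model entirely.
ollama.generate(model=model, prompt="", keep_alive=0)
```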
Was this function implemented? I don't see it in the docs.