`ollama.stop` option
Any plans to add ollama.stop as a function similar to the CLI to stop running models?
Do you mean aborting?
I thought that only aborts generation but doesn't completely stop the model?
Does it function the same as ollama stop {model-name}?
Oh, I didn't know about `ollama stop`; I thought you meant stopping a generation. Looking at the source code for the Ollama API (I don't know Go, so this is an educated guess), it unloads a model by sending a generate request with an empty prompt, the same way you can pre-load a model. That approach works for pre-loading models via the API, so presumably it also works for unloading them, even though it appears to be undocumented for the API. Sadly, this is not type-safe in ollama-js, but I have raised it in https://github.com/ollama/ollama-js/issues/162
I was looking for that, so thank you. I'm sending an empty prompt as well to warm up the model, so it's ready to go instead of loading on the first real prompt.
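For anyone else landing here, a minimal sketch of that warm-up call. The `GenerateClient` interface and the `warmUp` name are made up for illustration; the default ollama-js client should fit the interface, but I haven't checked every version.

```typescript
// Hypothetical structural type: anything with a compatible generate() method,
// such as the default client exported by ollama-js.
interface GenerateClient {
  generate(request: { model: string; prompt: string }): Promise<unknown>;
}

// Send an empty prompt so the server loads the model into memory
// before the first real request arrives.
async function warmUp(client: GenerateClient, model: string): Promise<void> {
  await client.generate({ model, prompt: "" });
}
```

Usage would look something like `await warmUp(ollama, "model-name")` after `import ollama from "ollama"`.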
A way to unload would be great; `stop` does that in the CLI. At the moment I'm using a child process to execute the stop command from inside JS, which isn't ideal. It gets complicated if you're in a separate Docker container.
Running ollama stop model-name makes this generate request:
```json
{
  "model": "model-name",
  "prompt": "",
  "suffix": "",
  "system": "",
  "template": "",
  "keep_alive": "0s",
  "options": null
}
```
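If you'd rather not depend on ollama-js at all, the same request can probably be sent straight to the REST API. This is a rough sketch, not a tested recipe: `buildStopRequest` and `stopViaHttp` are names I made up, the base URL assumes Ollama's default local address, the body is trimmed to the fields I assume matter, and the global `fetch` needs Node 18+ or a browser.

```typescript
// Trimmed-down version of the request body; the empty-string fields from the
// CLI request are assumed to be unnecessary here.
function buildStopRequest(model: string) {
  return { model, prompt: "", keep_alive: "0s", stream: false };
}

// Assumption: the server is reachable at Ollama's default address.
async function stopViaHttp(
  model: string,
  baseUrl = "http://localhost:11434",
): Promise<void> {
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildStopRequest(model)),
  });
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
}
```

From a separate Docker container you would pass that container's view of the Ollama host as `baseUrl` instead of the default.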
Therefore, you can stop a model using existing ollama-js functions:
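```typescript
import ollama from "ollama";

interface StopOptions {
  model: string;
  pollInterval?: number;
  client?: typeof ollama;
}

async function stop({ model, pollInterval = 500, client = ollama }: StopOptions) {
  if (!model.includes(":")) {
    // There may be edge cases where this doesn't work
    model += ":latest";
  }

  // True while the model still shows up in the list of running models
  async function isModelRunning() {
    const response = await client.ps();
    return response.models.some(m => m.name === model);
  }

  // Empty prompt plus keep_alive "0s" mirrors the request the CLI sends
  await client.generate({
    model,
    // Some of these might be unnecessary
    prompt: "",
    suffix: "",
    template: "",
    keep_alive: "0s",
    // `options: null` from the CLI request is left out here because
    // null may not type-check against the typed `options` field
  });

  while (await isModelRunning()) {
    await new Promise(resolve => setTimeout(resolve, pollInterval));
  }
}
```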
The polling is only necessary if you want to wait until the model is done stopping.
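One caveat with an open-ended polling loop is that it never returns if the model stays loaded for some reason. A possible guard is a generic helper with a deadline. This is only a sketch: `waitWhile` and its default values are my own invention, not part of ollama-js.

```typescript
// Poll `check` until it resolves to false or `timeoutMs` has elapsed.
// Resolves to true if the condition cleared, false if the deadline passed.
async function waitWhile(
  check: () => Promise<boolean>,
  pollInterval = 500,
  timeoutMs = 10_000,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (await check()) {
    if (Date.now() >= deadline) {
      return false;
    }
    await new Promise(resolve => setTimeout(resolve, pollInterval));
  }
  return true;
}
```

A check that asks `ps()` whether the model is still listed could be passed in as `check`, and the caller can decide what to do when the helper resolves to `false`.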
Awesome, thank you. So just call generate with the same model name and a 0s keep_alive?