Patrick Devine

Results: 323 comments of Patrick Devine

I'm going to go ahead and close the issue. Please feel free to comment and we can reopen if you're still seeing the issue.

There shouldn't be a case where the model is listed but can't be run. The `list` command in the client will fetch the list of models that it can run...
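For reference, the client's `list` command is backed by the server's tags endpoint, so you can check what the client sees directly. A minimal sketch, assuming the server is running on the default local port:

```shell
# List the locally available models straight from the API
curl http://localhost:11434/api/tags
```

If a model shows up here, the same server should be able to run it.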

@springsuu what version of Windows are you running? Are you up-to-date with patches?

@hekmon what version of Linux are you using and are you using this in the cloud somewhere? I just want to see if I can duplicate the issue.

> Edit: just saw your previous comments where you said you tested it on Ubuntu 22.04 and 2xA100. That's a bummer. Maybe on the model side then?...

OK, this turns out to be an NVIDIA problem where they updated the driver and it no longer loads the correct kernel modules. #4652 fixes this, but the workaround...

Note, this is specifically w/ driver version `555`. @hekmon if you do an `ollama ps` you can see that everything is loaded onto the CPU instead of the GPU, so...
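For example, `ollama ps` reports which processor each loaded model is running on; the output below is only illustrative (the model name, ID, and size are made up):

```
$ ollama ps
NAME            ID              SIZE    PROCESSOR    UNTIL
llama3:latest   365c0bd3c000    6.7 GB  100% CPU     4 minutes from now
```

A `100% CPU` entry in the `PROCESSOR` column is the telltale sign that the GPU wasn't picked up.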

So you can actually do this pretty easily if you have the non-quantized version checked out either from safetensors or the ollama model. There's instructions in the [import doc](https://github.com/ollama/ollama/blob/main/docs/import.md#quantizing-a-model). Short...
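The steps in the import doc boil down to pointing a Modelfile at the unquantized weights and passing `--quantize` to `ollama create`. A minimal sketch, where the path and model name are placeholders:

```
# Modelfile: FROM can point at a safetensors directory or an existing unquantized model
FROM /path/to/my-model-safetensors
```

Then build the quantized model with something like `ollama create --quantize q4_K_M my-model -f Modelfile`.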

I'm not sure whether the `pull` or the subsequent `ollama run` caused the EOF error. If it was the pull, it was most likely a transient error...

I think there's potentially a misunderstanding of the purpose of the Modelfile. It's not a config file, but more akin to a Makefile or a Dockerfile. It's probably hard to...
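To illustrate the Makefile/Dockerfile analogy: a Modelfile describes how to *build* a new model from a base one, not how to configure a running server. A minimal sketch (the base model and parameter values here are arbitrary):

```
# Build a new model from an existing base, baking in a parameter and a system prompt
FROM llama3
PARAMETER temperature 0.7
SYSTEM "You are a concise assistant."
```

Running `ollama create my-assistant -f Modelfile` then produces a new model, much as `docker build` produces a new image.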