sucream is correct, the CUDA library is being loaded from the wrong location.

```
Oct 17 13:25:55 pony ollama[4149997]: load_backend: loaded CUDA backend from /usr/lib/ollama/libggml-cuda.so
```

The recommendation is to...
> Also, it's built with -DGGML_BACKEND_DIR=%_libexecdir/ollama so the location is intended.

Perhaps, but the library usually lives in a `cuda_v12` directory. If your build process is not preserving the directory...
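For reference, the layout a stock install usually produces nests the CUDA backend under a versioned subdirectory rather than at the top level (paths below are illustrative of that usual layout, not taken from your system):

```
/usr/lib/ollama/cuda_v12/libggml-cuda.so   # usual location: versioned subdirectory
/usr/lib/ollama/libggml-cuda.so            # where your build appears to have put it
```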
What's the output of the following commands:

```
command -v ollama
find /usr/lib/ollama
find $(dirname $(dirname $(command -v ollama)))/lib/ollama
```
> incorrect just because it's somehow different

It's incorrect because it doesn't work, not because it's different. Different is fine. Arch, for example, has a different build and it works...
Would you please stop spamming issues with your PR. It has no relevance to this one or the others you posted it to.
Performance decrease for CPU only is likely https://github.com/ollama/ollama/issues/12886.
You cannot run a model directly from safetensors in ollama. The process of importing the model converts it to a GGUF quantized to FP16. The converter is based on code...
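As a minimal sketch of that import flow (the model name and path here are hypothetical), a Modelfile pointing at a local safetensors directory triggers the conversion during `create`:

```
# Modelfile pointing at a local safetensors checkout (path is hypothetical)
cat > Modelfile <<'EOF'
FROM /path/to/safetensors-model-dir
EOF

# create converts the weights to an FP16 GGUF; only then can the model be run
ollama create my-model -f Modelfile
ollama run my-model
```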
> > You cannot run a model directly from safetensors in ollama.
>
> I can, you already see multiple users doing just that.

Just because ollama fails to print...
Mistral Nemo is supported; importing the safetensors version of the model via ollama is not. A Modelfile allows setting extra parameters, like `num_gpu` in the original post. The `create` command does...
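As an illustration of setting such a parameter (the base tag and value are examples, not taken from the original post):

```
# Build on an existing model and pin the number of layers offloaded to GPU
# (base tag and num_gpu value here are only examples)
cat > Modelfile <<'EOF'
FROM mistral-nemo
PARAMETER num_gpu 33
EOF

ollama create mistral-nemo-gpu -f Modelfile
```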