See [here](https://github.com/ollama/ollama/issues/5913#issuecomment-2248262520) for a way to change the default `num_gpu` value for a model, so that you don't need to `/set parameter num_gpu xx` every time you load a model.
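One way to do this (a sketch of the general approach, not necessarily identical to the linked comment) is to bake the parameter into a new model tag via a Modelfile. The model name `llama3.1` and the value `20` below are illustrative:

```
# Dump the existing modelfile, append the desired default, and create a new tag.
ollama show llama3.1 --modelfile > Modelfile
echo "PARAMETER num_gpu 20" >> Modelfile      # illustrative value
ollama create llama3.1-gpu20 -f Modelfile
ollama run llama3.1-gpu20                     # loads with num_gpu 20 by default
```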
It's easier to debug if the full log is available.
[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging. Does it get in this state if `OLLAMA_NUM_PARALLEL=1`? Does the size of the CPU processes increase over time?
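If it's unclear whether the processes are growing, a rough way to track their resident size over time (a sketch; the process name match and interval are arbitrary):

```
# Log the resident size (RSS, in KB) of all ollama processes once a minute.
while true; do
  date
  ps -C ollama -o pid,rss,vsz,cmd
  sleep 60
done
```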
> > OLLAMA_NUM_PARALLEL=1
>
> @rick-github When and where should I apply this setting?

Depends on how you installed ollama. If you did `curl -fsSL https://ollama.com/install.sh | sh`, then in...
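For an install done via the script, ollama runs as a systemd service, and environment variables such as `OLLAMA_NUM_PARALLEL` are normally applied through a service override; a minimal sketch:

```
# Open an override for the service and add the variable, then restart.
sudo systemctl edit ollama.service
#   [Service]
#   Environment="OLLAMA_NUM_PARALLEL=1"
sudo systemctl daemon-reload
sudo systemctl restart ollama
```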
```
ollama version is 0.0.0
```

Using an official release makes it easier to debug.

```
pulling 939fd971f038... 100% ▕███████████▏ 228 GB
```

This is a large model, do you...
ollama is already listed as an Inference provider in twinny.
What's the output of `nvidia-smi` outside of the container?
~~If you (temporarily) install ollama as a service (`curl -fsSL https://ollama.com/install.sh | sh`), can it access the GPU?~~ I see that you've already done that.
`nvidia-smi` outside of the container shows an ollama runner using the GPU. Is that running inside the container or is the ollama-as-a-service still running?
What do the following show:

```
pstree -ls 3576
ps wwp3576
```
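To tell whether that PID belongs to the container or the host service, a couple of extra checks can help (PID `3576` is just the one from above):

```
# Container processes show docker/containerd paths in their cgroup.
cat /proc/3576/cgroup
# If the host service owns the runner, it will appear in the service's process tree.
systemctl status ollama
```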