There's currently no environment variable that allows configuring that in the service file. A hacky way to achieve this is to prevent ollama from loading the CUDA library: ``` sudo...
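
A minimal sketch of that workaround, assuming a systemd install; the library path is an assumption and varies by ollama version and install method:

```
# Hide the bundled CUDA backend so ollama falls back to the CPU runner.
# /usr/lib/ollama/cuda_v12 is a guess; check where your install put it.
sudo systemctl stop ollama
sudo mv /usr/lib/ollama/cuda_v12 /usr/lib/ollama/cuda_v12.disabled
sudo systemctl start ollama
```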

Set `OLLAMA_LOAD_TIMEOUT=30m` in the server environment.
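
On a systemd install that means editing the service override; `systemctl edit` is the documented way to do it:

```
sudo systemctl edit ollama.service
# In the override file that opens, add:
#   [Service]
#   Environment="OLLAMA_LOAD_TIMEOUT=30m"
sudo systemctl daemon-reload
sudo systemctl restart ollama
```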

Cloud server? https://github.com/ollama/ollama/issues/9292#issuecomment-2676702560

ollama [0.5.2-rc3](https://github.com/ollama/ollama/releases/tag/v0.5.2-rc3) bumps to a new version of llama.cpp; does that build use the M40s?

```
time=2024-12-12T15:43:20.589Z level=INFO source=runner.go:946 msg=system info="CUDA : ARCHS = 600,610,620,700,720,750,800,860,870,890,900 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX...
```
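
The lowest entry in that ARCHS list is 600 (compute capability 6.0), while the M40 is a Maxwell card at 5.2, so a build compiled with this list wouldn't target it. One way to check what a card reports, assuming a driver new enough to support the `compute_cap` query field:

```
# List each GPU's compute capability; the ARCHS values in the log are
# these numbers with the dot removed (e.g. 6.1 -> 610).
nvidia-smi --query-gpu=name,compute_cap --format=csv
```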

Does that mean setting `OLLAMA_LLM_LIBRARY=cuda_v11` would use all devices?
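
A quick way to test that, assuming you can run a one-off server in the foreground: set the variable for a single session and watch the startup log for which devices get picked up:

```
# Force the cuda_v11 runner for this session only.
OLLAMA_LLM_LIBRARY=cuda_v11 ollama serve
```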

The IPC mechanism between the CPU and GPU is a busy wait, which is why the CPU is at 100%. This has been [discussed](https://github.com/ggml-org/llama.cpp/issues/8684) by people who understand what the...
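
For illustration only (not llama.cpp's actual code), a busy wait looks like this; `gpu_kernel_finished` is a hypothetical check:

```
# The loop polls in a tight spin instead of sleeping or blocking,
# so the waiting core reads as 100% busy even though it's doing no work.
while ! gpu_kernel_finished; do
  :   # re-check immediately; no sleep
done
```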

Server logs would help diagnose the issue. Since larger models require more connections and more data transfer, they're more susceptible to interruptions.
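
On a systemd install the server logs can be pulled with `journalctl`:

```
# -u selects the ollama unit, -e jumps to the end of the log.
journalctl -u ollama -e
```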

If you are happy to treat it as a transient event, close the ticket. If it happens again, re-open it and add some server logs.

Vision support was merged recently (https://github.com/ollama/ollama/pull/6963); 0.3.14 doesn't include it.