frob


Failed in `ggml_compute_forward`, which is deep inside llama.cpp. There are currently no open tickets for this in the [issue tracker](https://github.com/ggerganov/llama.cpp/issues); filing one may get a response from somebody who has seen...

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.

```
Jun 15 08:10:25 ml-ai-ubuntu-gpu-mi300x1-192gb-atl1 ollama[5704]: time=2025-06-15T08:10:25.798Z level=INFO source=server.go:168 msg=offload library=rocm layers.requested=-1 layers.model=62 layers.offload=62 layers.split="" memory.available="[191.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="167.8 GiB" memory.required.partial="167.8 GiB" memory.required.kv="16.2 GiB" memory.required.allocations="[167.8 GiB]" memory.weights.total="150.0 GiB" memory.weights.repeating="149.3...
```
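The `msg=offload` line is just space-separated `key=value` pairs (some values double-quoted), so it can be pulled apart mechanically when eyeballing gets tedious. A minimal Python sketch, using an abbreviated copy of the line above:

```python
import re

# Abbreviated copy of the msg=offload log line above.
line = ('time=2025-06-15T08:10:25.798Z level=INFO source=server.go:168 msg=offload '
        'library=rocm layers.model=62 layers.offload=62 '
        'memory.available="[191.4 GiB]" memory.required.full="167.8 GiB"')

# A value is either double-quoted (may contain spaces) or a bare token.
fields = dict(re.findall(r'(\S+?)=("[^"]*"|\S+)', line))

# All 62 of 62 layers offload here: 167.8 GiB required fits in 191.4 GiB available.
print(fields["library"], fields["layers.offload"], fields["memory.required.full"])
```

Here `layers.offload` equals `layers.model`, which is the healthy case: the whole model fits on the GPU.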

```console
$ mkdir 0.12.10 && cd 0.12.10 && git clone https://github.com/ollama/ollama -b v0.12.10 . && cmake -B build
Cloning into '.'...
remote: Enumerating objects: 41175, done.
remote: Counting objects: 100%...
```

Needs hybrid support in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/7531

The initial release of the llama3.1 models didn't have a template that supported tools; if you re-pull the model you should get the update.

Not yet; there is an open ticket: https://github.com/ollama/ollama/issues/5794

```
time=2025-10-30T15:45:32.702+08:00 level=WARN source=sched.go:397 msg="model architecture does not currently support parallel requests" architecture=qwen3vl
```

https://github.com/ollama/ollama/blob/0a2d92081bb6b6b2d3eab5908fce08cfcf736e1d/server/sched.go#L393-L398

The log is truncated on the right; use this instead: `journalctl -u ollama --no-pager`.

```
Apr 28 17:20:32 GZ302EA ollama[2328]: time=2025-04-28T17:20:32.869+08:00 level=INFO source=server.go:138 msg=offload library=rocm layers.requested=-1 layers.model=33 layers.offload=17 layers.split="" memory.available="[3.5 GiB]" memory.gpu_overhead="0 B" memory.required.full="5.9 GiB" memory.required.partial="3.4 GiB" memory.required.kv="256.0 MiB" memory.required.allocations="[3.4 GiB]" memory.weights.total="4.3 GiB" memory.weights.repeating="3.9...
```
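Reading that line: `memory.required.full` (5.9 GiB) exceeds `memory.available` (3.5 GiB), so a full offload is impossible and only 17 of the 33 layers go to the GPU; the rest run on CPU, which is why generation is slow. A quick arithmetic check with the numbers copied from the log:

```python
# Figures copied from the msg=offload line above.
available_gib = 3.5         # memory.available
required_full_gib = 5.9     # memory.required.full (all 33 layers on GPU)
required_partial_gib = 3.4  # memory.required.partial (the 17 layers that fit)
layers_model, layers_offload = 33, 17

# Full offload does not fit, so the scheduler falls back to a partial
# offload that does fit within the available VRAM.
print(required_full_gib > available_gib)      # full offload impossible
print(required_partial_gib <= available_gib)  # partial offload fits
print(f"{layers_offload}/{layers_model} layers on GPU")
```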