Have you retried the download?
Your edit indicates that you have a working system; you've just got the format of the request wrong:

```console
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "What is 9 +..."
}'
```
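If you're testing by hand, adding `"stream": false` makes the endpoint return a single JSON object instead of a stream of chunks, which is easier to read (the prompt text here is just a placeholder):

```console
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```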
Please post a full log; the earlier parts of the log will include information about device detection, etc.
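If Ollama runs as a systemd service or in a container, something like this captures the log from startup (the container name is an assumption, adjust for your setup):

```console
# systemd install
journalctl -u ollama --no-pager > ollama.log
# Docker install
docker logs ollama > ollama.log 2>&1
```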
So it identified the devices and actually did a few completions:

```
[GIN] 2025/02/18 - 00:07:48 | 200 | 2.551880862s | 10.89.0.4 | POST "/api/chat"
[GIN] 2025/02/18 - 00:07:59 | ...
```
> My CPU is a 9700K (Coffee Lake), so it should be much more recent than the Ivy Bridge chips

Yep, `lscpu` shows that the host CPU supports `rdrand`, so...
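A quick way to check which of these flags the host CPU actually advertises is to grep the kernel's flag list:

```console
grep -oE 'rdrand|rdpid' /proc/cpuinfo | sort -u
```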
Never mind, wrong opcode: it's actually RDPID, not RDRAND.

```
0000000000000000 :
   0:	f3 0f c7 f8          	rdpid  %rax
   4:	25 ff 03 00 00       	and    $0x3ff,%eax
   9:	48 8b ...
```
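If you want to verify the decoding yourself, one way is to write the raw bytes to a scratch file and let objdump disassemble them (the file name is arbitrary):

```console
printf '\xf3\x0f\xc7\xf8' > op.bin
objdump -D -b binary -m i386:x86-64 op.bin
# ...
#    0:   f3 0f c7 f8     rdpid  %rax
```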
I think your best bet is to file a ticket with [ipex-llm](https://github.com/intel/ipex-llm/issues). They have a [Dockerfile](https://github.com/intel/ipex-llm/blob/main/docker/llm/inference-cpp/Dockerfile) for building the container image, so I thought it might be possible to pull a...
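For what it's worth, building their image locally would look roughly like this (a sketch; the tag is arbitrary and it assumes the Dockerfile builds from its own directory):

```console
git clone https://github.com/intel/ipex-llm
cd ipex-llm/docker/llm/inference-cpp
docker build -t ipex-llm-inference-cpp .
```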
It's a bug in the memory estimation logic. You have `OLLAMA_GPU_OVERHEAD=1G`, and at the point that ollama is trying to find space to fit the model, the CUDA device is...
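As a sanity check, you can compare that 1 GiB reservation against what the driver actually reports as free:

```console
nvidia-smi --query-gpu=memory.total,memory.free --format=csv
```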
Setting `CUDA_VISIBLE_DEVICES=-1` on Windows sometimes causes problems with the Nvidia driver: #9836. A safer way of restricting the model to the CPU is to set `num_gpu:0` as described [here](https://github.com/ollama/ollama/issues/9836#issuecomment-2731254084).
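For example, over the REST API the option can be sent with the request (the model name and prompt are just placeholders):

```console
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Why is the sky blue?",
  "options": { "num_gpu": 0 }
}'
```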