Results 806 comments of frob

https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#amd-gpu-discovery:~:text=%22exec%2Dopts%22%3A%20%5B%22native.cgroupdriver%3Dcgroupfs%22%5D

Have you tried setting `native.cgroupdriver` as in the provided link?

```console
$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
```
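Since a syntax error in `daemon.json` will stop the Docker daemon from starting, it's worth validating the file before restarting Docker. A minimal sketch (using a temp copy here rather than the real `/etc/docker/daemon.json`, and `python3 -m json.tool` as a convenient validator):

```shell
# Sketch: write a sample daemon.json to a temp path and check it parses as JSON
cat > /tmp/daemon.json <<'EOF'
{
  "runtimes": {
    "nvidia": { "args": [], "path": "nvidia-container-runtime" }
  },
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF

# json.tool exits non-zero on malformed JSON
python3 -m json.tool /tmp/daemon.json > /dev/null && echo "valid JSON"
```

After confirming the real file is valid, restart the Docker daemon so the new `exec-opts` take effect.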

Please post the full log; there are earlier log lines about device detection and memory calculations that may be relevant. Also set `OLLAMA_DEBUG=1` in the [server environment](https://github.com/ollama/ollama/blob/main/docs/faq.md#setting-environment-variables-on-windows), it may give...
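On Windows, follow the linked FAQ to set the variable system-wide; on Linux/macOS a quick sketch is to export it in the shell that launches the server (assuming you run `ollama serve` manually rather than via a service manager):

```shell
# Export OLLAMA_DEBUG in the server's environment before starting it;
# the variable must be set for the server process, not the client.
export OLLAMA_DEBUG=1
# then: ollama serve
```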

Unfortunately it's not clear why the alloc failed; it seems like there should be plenty of VRAM available. The 0xc0000409 (STATUS_STACK_BUFFER_OVERRUN) exit status suggests that recovery from the failed alloc...

Create a new model:

```sh
$ ollama show --modelfile llama3.1:70b-instruct-q3_K_L | sed -e 's/^FROM.*/FROM llama3.1:70b-instruct-q3_K_L/' > Modelfile
$ echo "PARAMETER num_gpu 75" >> Modelfile
$ ollama create llama3.1:70b-instruct-q3_K_L-ng75
$ ollama...
```
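For reference, the resulting `Modelfile` from those steps would look roughly like the sketch below (the real file also carries the TEMPLATE and any other PARAMETER lines copied from the base model by `ollama show --modelfile`):

```
FROM llama3.1:70b-instruct-q3_K_L
PARAMETER num_gpu 75
```

The `sed` rewrite of the FROM line just points the new model at the existing base model by name instead of the blob path, so the weights aren't duplicated.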

Gemma2 needs ollama 0.1.47 or newer. mistral-nemo needs ollama 0.2.8 or newer.

```
load_tensors: tensor 'token_embd.weight' (q6_K) (and 42 others) cannot be used with preferred buffer type CUDA_Host, using CPU instead
```

This is not blocking, it's just indicating that some tensors...

ollama has calculated that it can only fit 13 layers on the GPUs (`layers.offload=13`) which would take 36G of 39G on each GPU. You have overridden that by setting `num_gpu...
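The arithmetic behind that estimate, sketched with hypothetical rounded numbers (the real per-layer footprint depends on the model, quantization, and context size):

```shell
# Back-of-envelope for layers.offload=13: at roughly 2.77G per layer
# (hypothetical figure), 13 layers land near the 36G ollama budgeted
# out of the 39G usable per GPU.
layers=13
per_layer_mb=2770   # hypothetical approx. per-layer footprint in MB
echo $(( layers * per_layer_mb )) MB
```

Forcing more layers than this estimate allows via `num_gpu` is what leads to the allocation failure.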

Please provide the full log with `num_gpu` unset.