GPU utilization will vary with the efficiency of the model and external factors like power usage, etc. I don't know how this applies to vGPUs, as they are a shared...
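As a quick sanity check on a bare-metal NVIDIA GPU (an assumption; vGPU reporting may differ or be restricted by the hypervisor), you can watch utilization while a request is running:

```
# Poll GPU utilization and memory once per second (NVIDIA driver tooling).
watch -n 1 nvidia-smi
```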
In the server log there will be lines like:

```
ollama | time=2024-08-14T22:59:28.178Z level=INFO source=memory.go:309 msg="offload to cuda" layers.requested=-1 layers.model=81 layers.offload=14 layers.split="" memory.available="[11.6 GiB]" memory.required.full="55.9 GiB" memory.required.partial="11.4 GiB" memory.required.kv="640.0 MiB"...
```
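In the example above, only 14 of the model's 81 layers fit in the 11.6 GiB available, since the full model would need 55.9 GiB. If the log is long, grepping for the offload report is a quick way to find these lines (assuming you've captured the log to a file; `server.log` here is a placeholder for your path):

```
# Locate the memory/offload report in a captured server log.
grep "offload to cuda" server.log
```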
What's in the server logs when it fails?
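On a Linux install managed by systemd (an assumption about your setup; see the troubleshooting doc for other platforms), the logs can be pulled with:

```
# Dump the ollama service log on systemd-based Linux installs.
journalctl -u ollama --no-pager
```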
This is probably a transient issue; there have been reports that Cloudflare has been a bit flaky.
What are the results of the following commands: `which ollama`, `ldd ollama`, `strace ollama -v`?
What's the result of these commands: `command -v ollama`, `ldd /usr/local/bin/ollama`?
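For reference, here's an annotated sketch of what those checks probe (the `/usr/local/bin/ollama` path is an assumption based on the default Linux install location):

```
command -v ollama           # which ollama binary is first on PATH
ldd /usr/local/bin/ollama   # shared libraries the binary is linked against
```

Together these tell you whether the binary being run is the one you think it is, and whether it resolves its library dependencies (e.g. CUDA/ROCm runtimes) correctly.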
[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) would help in debugging. What's the value of `$Modelname`?
If you add `OLLAMA_DEBUG=1` to the server environment, the runner will print slot processing, which may give insight into what's causing the long processing times. Just to verify, the...
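One way to enable it for a foreground run (for a systemd service you'd set the variable in the unit's environment instead):

```
# Start the server with debug logging enabled.
OLLAMA_DEBUG=1 ollama serve
```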
[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.
It's quite possible that the difference in build environment has an effect. Note, however, that you are not comparing the same model: llama.cpp is using gemma-2-2b-it-Q4_K_M.gguf and ollama is...
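One way to confirm which quantization ollama actually loaded (the `gemma2:2b` tag below is an assumption about which model you pulled; substitute your own):

```
# Print model metadata, including architecture, parameter count, and quantization.
ollama show gemma2:2b
```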