```
time=2024-12-05T01:17:11.613-08:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2]"
```
The version you've built doesn't have any runners that use the GPU, only CPU runners.
Which build guide? Fedora 41 appears to be [not supported yet](https://github.com/ollama/ollama/issues/7869), so building from source may not work yet. If you have Docker installed, you could try the Docker image instead.
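If you do go the Docker route, something like this is the usual starting point for an NVIDIA GPU. It assumes the NVIDIA Container Toolkit is already configured for Docker, and the model name below is just a placeholder:

```shell
# Run the official Ollama image with GPU access
# (requires nvidia-container-toolkit to be set up for Docker)
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

# Then pull and run a model inside the container
docker exec -it ollama ollama run llama3.2
```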
Most releases that aren't bleeding edge should work. Fedora 41 was released on October 29, 2024, so it will take a little work to make sure all the right dependencies are...
Ollama has probably done the tensor split sub-optimally. [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will help with debugging. What parameters do you use when you run llama.cpp directly?
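For comparison, the flags that matter most in llama.cpp are the GPU layer count and the tensor split. A sketch of what I mean (the model path and split ratios are placeholders; older builds call the binary `main` rather than `llama-cli`):

```shell
# -ngl: number of layers to offload to the GPU(s)
# --tensor-split: proportion of the model assigned to each GPU
./llama-cli -m /path/to/model.gguf -ngl 99 --tensor-split 3,1 -p "Hello"
```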
What's the token generation rate for both configurations?
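With ollama the easiest way to get this is the `--verbose` flag, which prints timing stats (including eval rate in tokens/s) after each response; llama.cpp prints similar timings when it exits. Substitute your own model name:

```shell
# Prints prompt eval / eval rates (tokens per second) after the response
ollama run llama3.2 --verbose
```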
Logs from earlier in the run will show how ollama calculated the available memory and how many layers to offload.
Do you have the logs for this allocation? The logs you posted earlier were from two different runs and it's difficult to piece together the flow.
What's more interesting is what ollama thought the state of the GPU was before it tried to allocate layers. The distribution algorithm tries to equalize across GPUs, but doesn't account for...
If the model is loading and unloading from VRAM it will be recorded in the logs. But this behaviour isn't normal, so a screen recording may shed some light.
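If capturing it as text is easier than a screen recording, sampling VRAM usage while you reproduce the issue would show the same thing. This assumes an NVIDIA GPU:

```shell
# Log GPU memory usage once per second while reproducing the issue
nvidia-smi --query-gpu=timestamp,memory.used,memory.total --format=csv -l 1 | tee vram.log
```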
Can you add the logs for this period?
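On Linux, if ollama is running as a systemd service, the troubleshooting guide linked above points at the journal; you can narrow it to the window in question (the timestamps here are placeholders):

```shell
# Server logs for a specific time window (adjust the timestamps)
journalctl -u ollama --since "2024-12-05 01:00:00" --until "2024-12-05 01:30:00" --no-pager
```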