Max Zanoga

Results: 9 comments by Max Zanoga

After reinstalling Ollama, it started offloading to the GPU, but I ran into a new problem:
> time=2024-04-26T19:26:29.174Z level=DEBUG source=server.go:420 msg="server not yet available" error="health resp: Get \"http://127.0.0.1:38459/health\": dial tcp 127.0.0.1:38459:...

Yes, I have 5 GPUs (5 x WX9100) in my system.
> ls /sys/class/kfd/kfd/topology/nodes/
> 0 1 2 3 4 5
> lspci | grep VGA
> 03:00.0 VGA compatible controller: Advanced...
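For completeness, the same check can be made with the ROCm tools (a sketch, assuming rocm-smi and rocminfo are installed):

```bash
# Per-GPU summary (VRAM, temperature, utilization) for every visible ROCm device
rocm-smi
# List the HSA agents; each gfx entry should correspond to one of the WX9100 cards
rocminfo | grep -i "gfx"
```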

I tried to run Ollama with just one GPU:
> OLLAMA_DEBUG=1 HIP_VISIBLE_DEVICES=1 ollama serve

The result:
> time=2024-05-02T21:54:36.454Z level=INFO source=gpu.go:314 msg="Discovered GPU libraries: []"
> time=2024-05-02T21:54:36.454Z level=INFO source=cpu_common.go:15 msg="CPU has...
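To narrow it down, here is a sketch of testing each device index one at a time (the timeout and grep are only there to keep the output short; indices 0-4 are assumed to match the 5 cards):

```bash
# Start the server briefly against each ROCm device and check whether
# any GPU libraries are discovered for that index.
for i in 0 1 2 3 4; do
  echo "=== HIP_VISIBLE_DEVICES=$i ==="
  OLLAMA_DEBUG=1 HIP_VISIBLE_DEVICES=$i timeout 30 ollama serve 2>&1 | grep -i "discovered"
done
```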

**After the update:**
> ollama -v
> ollama version is 0.1.33

**The same problem:**
> time=2024-05-03T00:51:14.618Z level=DEBUG source=server.go:466 msg="server not yet available" error="health resp: Get \"http://127.0.0.1:43461/health\": dial tcp 127.0.0.1:43461: i/o timeout"...

I can provide access to the server for debugging if it's necessary.

In my case llama.cpp works great: I run 5 llama.cpp processes, each on a different GPU (a sketch of how I launch them is below). Thank you for your help. I think it's not critical to support so...
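Roughly how I launch them, assuming a ROCm build of the llama.cpp server example (the model path and ports are placeholders):

```bash
# One server process per GPU; HIP_VISIBLE_DEVICES pins each process to a single ROCm device.
for i in 0 1 2 3 4; do
  HIP_VISIBLE_DEVICES=$i ./server \
    -m /path/to/model.gguf \
    -ngl 99 \
    --port $((8080 + i)) &
done
wait
```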

Any updates? I tried to run collectstatic and received the same error: `Post-processing 'vendor/bootswatch/default/bootstrap.min.css' failed!`
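For reference, the command I'm running is essentially the following (a sketch; the extra verbosity just shows which file post-processing trips over):

```bash
# Collect static files non-interactively and print per-file post-processing output
python manage.py collectstatic --noinput -v 2
```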

steps:
> git clone https://github.com/turboderp/exllama
> cd exllama
> pip install -r requirements.txt
> python test_benchmark_inference.py -d -p -ppl

result:
> python test_benchmark_inference.py -d /home/dev/models/Mistral-7B-Instruct-v0.2-GPTQ/ -p -ppl
> Successfully preprocessed...

Maybe this gives more information about the error:

**gdb --args python3 test_benchmark_inference.py -d /home/dev/models/Mistral-7B-Instruct-v0.2-GPTQ/ -p -ppl**
> #0 0x00007fff4e89540e in rocblas_hgemm () from /home/dev/workspace/numpy_no_avx2/venv/lib/python3.10/site-packages/torch/lib/librocblas.so
> #1 0x00007fff86e491dd in hipblasHgemm ()...
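The backtrace was captured roughly like this (a sketch; gdb's -ex flags simply script the interactive run/bt steps):

```bash
# Re-run the benchmark under gdb and dump the backtrace automatically when it faults
gdb -batch -ex run -ex bt --args \
  python3 test_benchmark_inference.py -d /home/dev/models/Mistral-7B-Instruct-v0.2-GPTQ/ -p -ppl
```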