
RamaLama won't recognize RX5700XT

Open Split7fire opened this issue 11 months ago • 18 comments

The whole story starts with @RealVishy's comment in #2503. That comment said that the RX5700XT works well with RamaLama on Linux.

I'm using Bluefin-dx and tried to run RamaLama, to no avail. I created an issue at ublue-os/bluefin#2197 and got a suggestion to post the issue here.

I'm using the RamaLama bundled with the distro:

❯ /usr/bin/ramalama -v
ramalama version 0.5.2

Testing:

/usr/bin/ramalama --debug run llama3.2
run_cmd:  podman inspect quay.io/ramalama/rocm:0.5
Working directory: None
Ignore stderr: False
Ignore all: True
exec_cmd:  podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_Eef0KsY5uh --pull=newer -t --device /dev/dri --device /dev/kfd -e HIP_VISIBLE_DEVICES=0 --mount=type=bind,src=/var/home/vlad/.local/share/ramalama/models/ollama/llama3.2:latest,destination=/mnt/models/model.file,ro quay.io/ramalama/rocm:latest llama-run -c 2048 --temp 0.8 -v /mnt/models/model.file
Loading modelggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no                                                                                                                     
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 5700 XT, gfx1010:xnack- (0x1010), VMM: no, Wave Size: 32

~ took 5s 
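
For reference, the run above exits right after detecting the device, so one quick sanity check is whether the GPU is reachable from inside the same container at all. A minimal sketch, assuming rocminfo is present in the quay.io/ramalama/rocm image:

❯ podman run --rm --device /dev/dri --device /dev/kfd \
    quay.io/ramalama/rocm:latest rocminfo | grep -A3 "Marketing Name"
# If the RX 5700 XT shows up here, device pass-through is fine and the failure
# is somewhere in the llama.cpp/ROCm layer instead.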

Any help appreciated.

Split7fire avatar Feb 13 '25 13:02 Split7fire

Maybe you need this fix?

https://github.com/containers/ramalama/pull/802

ericcurtin avatar Feb 13 '25 14:02 ericcurtin

Maybe you need this fix?

#802

Thanks for your suggestion. I tried:

❯ /usr/bin/ramalama --debug --ngl=999 run llama3.2
usage: ramalama [-h] [--container] [--debug] [--dryrun] [--engine ENGINE] [--gpu] [--image IMAGE] [--nocontainer] [--runtime {llama.cpp,vllm}] [--store STORE] [-v]
                {help,containers,ps,convert,info,list,ls,login,logout,pull,push,rm,run,serve,stop,version} ...
ramalama: error: unrecognized arguments: --ngl=999

~ 

Split7fire avatar Feb 13 '25 14:02 Split7fire

It's --ngl 999 rather than --ngl=999

ericcurtin avatar Feb 13 '25 15:02 ericcurtin

It's --ngl 999 rather than --ngl=999

Yeah, I tried that too

❯ /usr/bin/ramalama --debug --ngl 999 run llama3.2
usage: ramalama [-h] [--container] [--debug] [--dryrun] [--engine ENGINE] [--gpu] [--image IMAGE] [--nocontainer] [--runtime {llama.cpp,vllm}] [--store STORE] [-v]
                {help,containers,ps,convert,info,list,ls,login,logout,pull,push,rm,run,serve,stop,version} ...
ramalama: error: argument subcommand: invalid choice: '999' (choose from help, containers, ps, convert, info, list, ls, login, logout, pull, push, rm, run, serve, stop, version)

~ 

Split7fire avatar Feb 13 '25 15:02 Split7fire

You need to put it after the run command I think

ericcurtin avatar Feb 13 '25 15:02 ericcurtin

You need to put it after the run command I think

Nope.

❯ /usr/bin/ramalama --debug run --ngl 999 llama3.2
usage: ramalama [-h] [--container] [--debug] [--dryrun] [--engine ENGINE] [--gpu] [--image IMAGE] [--nocontainer] [--runtime {llama.cpp,vllm}] [--store STORE] [-v]
                {help,containers,ps,convert,info,list,ls,login,logout,pull,push,rm,run,serve,stop,version} ...
ramalama: error: unrecognized arguments: --ngl

~ 

Split7fire avatar Feb 13 '25 16:02 Split7fire

Can you try updating the version of ramalama? This ngl thing was added recently enough.
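
A minimal sketch of pulling a newer release alongside the distro package, assuming the PyPI package installs cleanly on an atomic desktop and that the newer release accepts --ngl on the run subcommand:

❯ pip install --user ramalama      # or: pipx install ramalama
❯ ~/.local/bin/ramalama -v         # confirm it is newer than 0.5.2
❯ ~/.local/bin/ramalama run --ngl 999 llama3.2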

ericcurtin avatar Feb 13 '25 17:02 ericcurtin

@ericcurtin Well, after my distro updates arrived (Bluefin-dx), I re-ran ramalama with no success.

❯ ramalama -v
ramalama version 0.5.5
❯ /usr/bin/ramalama --debug run llama3.2
run_cmd:  podman inspect quay.io/ramalama/rocm:0.5
Working directory: None
Ignore stderr: False
Ignore all: True
exec_cmd:  podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_wHwsqJYifh --pull=newer -t --device /dev/dri --device /dev/kfd -e HIP_VISIBLE_DEVICES=0 --mount=type=bind,src=/var/home/vlad/.local/share/ramalama/models/ollama/llama3.2:latest,destination=/mnt/models/model.file,ro quay.io/ramalama/rocm:latest llama-run -c 2048 --temp 0.8 -v --ngl 999 /mnt/models/model.file
Loading modelggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 5700 XT, gfx1010:xnack- (0x1010), VMM: no, Wave Size: 32

~ took 5s 

Any further ideas?

Split7fire avatar Feb 22 '25 02:02 Split7fire

Could you paste the full "--debug" output?

Also, what are you using to check whether the GPU is being utilised? nvtop?

5 seconds is reasonable to initialize a GPU.

Might be worth trying @maxamillion's Fedora-based container images or Vulkan also.
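
To try a different backend image, the top-level --image option from the usage output above can point RamaLama at another llama.cpp container build. A sketch only; the exact image name for a Vulkan build (quay.io/ramalama/vulkan here) is an assumption and may differ:

❯ ramalama --image quay.io/ramalama/vulkan:latest --debug run llama3.2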

ericcurtin avatar Feb 22 '25 14:02 ericcurtin

@ericcurtin This is already the "full" debug output.

I presume some kind of input command line should appear. I also tested bench, with similar results:

❯ ramalama --debug bench llama3.2
run_cmd:  podman inspect quay.io/ramalama/rocm:0.5
Working directory: None
Ignore stderr: False
Ignore all: True
exec_cmd:  podman run --rm -i --label RAMALAMA --security-opt=label=disable --name ramalama_hsYIHYxm4m --pull=newer -t --device /dev/dri --device /dev/kfd -e HIP_VISIBLE_DEVICES=0 --mount=type=bind,src=/var/home/vlad/.local/share/ramalama/models/ollama/llama3.2:latest,destination=/mnt/models/model.file,ro quay.io/ramalama/rocm:latest llama-bench -ngl 999 -m /mnt/models/model.file
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 5700 XT, gfx1010:xnack- (0x1010), VMM: no, Wave Size: 32

~ took 4s 

Might be worth trying @maxamillion's Fedora-based container images or Vulkan also.

How can I do that?

Split7fire avatar Feb 22 '25 15:02 Split7fire

@Split7fire Seems like llama-run/llama-bench is crashing then; you'll need to debug this in the llama.cpp layer.
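
One way to get at that layer is to start the same container interactively and run llama-bench by hand, so the exit code and any error text are visible. A sketch reusing the exec_cmd from the debug output above; it assumes /bin/sh exists in the image:

❯ podman run --rm -it --device /dev/dri --device /dev/kfd -e HIP_VISIBLE_DEVICES=0 \
    --mount=type=bind,src=/var/home/vlad/.local/share/ramalama/models/ollama/llama3.2:latest,destination=/mnt/models/model.file,ro \
    --entrypoint /bin/sh quay.io/ramalama/rocm:latest
# inside the container:
$ llama-bench -ngl 999 -m /mnt/models/model.file; echo "exit code: $?"

A non-zero exit code right after the "found 1 ROCm devices" line would point at the HIP backend rather than at RamaLama itself.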

ericcurtin avatar Feb 22 '25 15:02 ericcurtin

@ericcurtin It's getting more and more obscure... @RealVishy reported a working ramalama with this kind of hardware on a similar atomic desktop, but I cannot reproduce it. Also, is the RX5700XT supported by ROCm at all? As far as I know, the latest ROCm has no support for gfx1010. Is RamaLama using its own ROCm layer on top of the official one?
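
For what it's worth, a workaround often suggested for RDNA1 cards that official ROCm builds no longer target is to override the reported gfx version. A sketch only; HSA_OVERRIDE_GFX_VERSION=10.3.0 makes the runtime treat gfx1010 as gfx1030, and it may or may not be stable on this card:

❯ podman run --rm -it --device /dev/dri --device /dev/kfd \
    -e HIP_VISIBLE_DEVICES=0 -e HSA_OVERRIDE_GFX_VERSION=10.3.0 \
    --mount=type=bind,src=/var/home/vlad/.local/share/ramalama/models/ollama/llama3.2:latest,destination=/mnt/models/model.file,ro \
    quay.io/ramalama/rocm:latest llama-bench -ngl 999 -m /mnt/models/model.file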

Split7fire avatar Feb 23 '25 04:02 Split7fire

@Split7fire still having this issue?

rhatdan avatar Apr 02 '25 10:04 rhatdan

@rhatdan Certainly, yes. Just to be sure, I re-tested ramalama --debug bench llama3.2 and got this:

run_cmd:  podman inspect quay.io/ramalama/rocm:0.6
Working directory: None
Ignore stderr: False
Ignore all: True
exec_cmd:  podman run --rm -i --label ai.ramalama --name ramalama_OUacPYIFEh --env=HOME=/tmp --init --security-opt=label=disable --cap-drop=all --security-opt=no-new-privileges --label ai.ramalama.model=llama3.2 --label ai.ramalama.engine=podman --label ai.ramalama.runtime=llama.cpp --label ai.ramalama.command=bench --pull=newer -t --device /dev/dri --device /dev/kfd -e HIP_VISIBLE_DEVICES=0 --network none --mount=type=bind,src=/var/home/vlad/.local/share/ramalama/models/ollama/llama3.2:latest,destination=/mnt/models/model.file,ro quay.io/ramalama/rocm:latest llama-bench -ngl 999 -m /mnt/models/model.file
Trying to pull quay.io/ramalama/rocm:latest...
Getting image source signatures
Copying blob 0159dca2e5b7 done   | 
Copying blob 23bf9faaf948 done   | 
Copying blob 23f6dbb37a63 done   | 
Copying config 04bfb0587d done   | 
Writing manifest to image destination
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 5700 XT, gfx1010:xnack- (0x1010), VMM: no, Wave Size: 32

and nothing else.

Split7fire avatar Apr 06 '25 14:04 Split7fire

Could you update the version of RamaLama you are using? It seems like you are on ramalama 0.6.*; we are getting ready to release 0.7.3.

rhatdan avatar Apr 07 '25 16:04 rhatdan

@rhatdan I'm using the ramalama from my distro (Bluefin-dx), so I'm tied to Bluefin's release cycle. I tried to install it via pip, but that failed.

Split7fire avatar Apr 07 '25 16:04 Split7fire

OK, it looks like you are using the latest image with an older ramalama. Not sure whether this makes a difference.
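
If the mismatch does matter, the image tag can be pinned explicitly instead of floating to :latest, using the top-level --image option. A sketch, assuming tags follow the quay.io/ramalama/rocm:<major.minor> pattern seen in the run_cmd lines above:

❯ ramalama --image quay.io/ramalama/rocm:0.5 --debug bench llama3.2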

rhatdan avatar Apr 07 '25 19:04 rhatdan

@maxamillion PTAL

rhatdan avatar Apr 07 '25 19:04 rhatdan

An update: since February I have reinstalled Aurora on my device, but nothing changed. I'm open to any suggestions. My current software stack:

KDE Plasma Version: 6.3.5
KDE Frameworks Version: 6.14.0
Qt Version: 6.9.0
Kernel Version: 6.14.5-300.fc42.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 12 × Intel® Xeon® CPU E5-1650 0 @ 3.20GHz
Memory: 67.3 GB of RAM
Graphics Processor: AMD Radeon RX 5700 XT
Manufacturer: HUANANZHI

Split7fire avatar Jun 15 '25 11:06 Split7fire

I recall Aurora/Bluefin comes with Linuxbrew installed. In that case you can grab the latest ramalama with brew install ramalama.
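
A minimal sequence, assuming the Linuxbrew formula works on Aurora:

❯ brew install ramalama
❯ ramalama -v                      # should now report a much newer release than the distro's 0.5.x
❯ ramalama --debug run llama3.2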

alaviss avatar Jun 15 '25 19:06 alaviss

Can I close this issue?

rhatdan avatar Jul 22 '25 14:07 rhatdan

@rhatdan, actually ramalama still won't recognize the RX5700XT. From time to time I re-test ramalama, with no luck.

Split7fire avatar Jul 25 '25 06:07 Split7fire

Was this ever debugged with llama.cpp?

rhatdan avatar Jul 25 '25 10:07 rhatdan

I would really appreciate any hints on debugging this. P.S. I also tried llama.cpp on its own, with no luck either. It just exits and that's all.
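
In case it helps with the llama.cpp side, one way to capture more than a silent exit is to record the exit code and turn up the HIP runtime logging. A sketch; the binary name, model path, and prompt are placeholders for whatever your local build uses:

❯ AMD_LOG_LEVEL=4 ./llama-cli -m model.gguf -ngl 999 -p "hello" 2> hip.log
❯ echo "exit code: $?"
❯ tail -n 40 hip.log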

Split7fire avatar Jul 25 '25 12:07 Split7fire

Please open an issue there; they are more likely to know what is going on.

rhatdan avatar Jul 26 '25 21:07 rhatdan

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Aug 27 '25 00:08 github-actions[bot]

I don't know the state of llama.cpp, but this is not something RamaLama can fix itself, so closing.

rhatdan avatar Aug 27 '25 10:08 rhatdan

As a remark on this issue: llama.cpp works great with the RX5700XT when downloaded from the llama.cpp release page, so this may be a RamaLama issue. If someone has a guide on how to debug this kind of issue, please share it.

Split7fire avatar Aug 27 '25 13:08 Split7fire

How did you build it? Perhaps we need to update the llama.cpp version we are using.
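
One way to compare the two builds is to check what version and backend each one reports. A sketch; it assumes the release binary is llama-cli and that the tools in the RamaLama image accept --version, which may not hold for every binary:

# release build downloaded from the llama.cpp release page
❯ ./llama-cli --version
# build shipped in the RamaLama ROCm image
❯ podman run --rm quay.io/ramalama/rocm:latest llama-run --version

If the release build reports a Vulkan or CPU backend while the container build reports HIP/ROCm, the difference is the backend rather than the llama.cpp version itself.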

rhatdan avatar Aug 28 '25 11:08 rhatdan