
Cortex support for AMD GPUs

Open ux-han opened this issue 1 year ago • 8 comments

  • [X] I have searched the existing issues

Background

Currently, Jan only supports NVIDIA GPUs for acceleration. Users with AMD GPUs, particularly those using eGPUs on Intel Macs, are unable to utilize their graphics hardware for acceleration in Jan. This limits the performance potential for a segment of Jan users who have invested in AMD GPU solutions.

Feature request

  • [ ] https://github.com/janhq/cortex/issues/323
  • [ ] https://github.com/janhq/jan/issues/2587
  • [ ] https://github.com/janhq/docs/issues/15
  • [x] https://github.com/janhq/jan/issues/3394
  • [ ] https://github.com/janhq/jan/issues/3530
  • [ ] https://github.com/janhq/jan/issues/4375

Proposed Implementation

For engineers to fill in

Additional Notes

Consider prioritizing support for popular AMD GPU models like the Vega series initially.

ux-han avatar Sep 02 '24 12:09 ux-han

Commenting in support of this feature request

cracksauce avatar Sep 17 '24 14:09 cracksauce

Please support discrete AMD GPUs on MacBook Pro 2017-2019.

alsyundawy avatar Jan 29 '25 15:01 alsyundawy

Not able to use Jan with AMD GPUs on Linux? Sad...

camilocorreao avatar Jan 31 '25 09:01 camilocorreao

Not able to use Jan with AMD GPUs on Linux? Sad...

You can try Vulkan, but it's still slow. An R7 2700 was faster than an RX 6600 over Vulkan (using a Qwen 2.5 32B Q4 model).
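For anyone wanting to reproduce this kind of CPU-vs-Vulkan comparison, a minimal llama-bench sketch (the model path is a placeholder; -ngl 0 keeps all layers on the CPU, -ngl 99 offloads everything to the GPU):

# CPU only: no layers offloaded
llama-bench -m /path/to/Qwen2.5-32B-Instruct-Q4_K_M.gguf -ngl 0
# Vulkan build: offload all layers to the GPU
llama-bench -m /path/to/Qwen2.5-32B-Instruct-Q4_K_M.gguf -ngl 99

Running both against the same model file makes the CPU and GPU numbers directly comparable.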

syorito-hatsuki avatar Jan 31 '25 21:01 syorito-hatsuki

AMD support is a great idea.

watermel0l avatar Mar 09 '25 00:03 watermel0l

Hopefully ROCm support can be added. For those who don't know, you can enable the admittedly slower Vulkan support in the meantime.

Go into Settings -> Advanced Settings and enable "experimental mode". Then ensure that linux-amd64-vulkan is selected under the Local engine settings tab; if not, just select it. You should then be running with GPU acceleration, though in my testing (I used LM Studio for ROCm on Linux) the Vulkan version seems slightly slower than ROCm. So it would be nice for ROCm to be officially supported in Jan.
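As a quick sanity check before switching backends, you can verify the GPU is visible to Vulkan at all (this assumes the vulkan-tools package is installed; it's a generic check, not part of Jan):

# List Vulkan devices and driver info
vulkaninfo --summary
# Just the device names (output field name may vary by driver)
vulkaninfo --summary | grep deviceName

If your AMD card doesn't show up here, the Vulkan engine in Jan won't be able to use it either.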

Anyway, hope this helps :) I appreciate all the hard work the contributors do on this project. Please consider prioritizing this, as a lot of people are using AMD nowadays.

32bitx64bit avatar Mar 29 '25 06:03 32bitx64bit

So, I was interested to see whether my system would even benefit from ROCm support. Benchmarking llama.cpp on Linux with an RX 6800 XT, Vulkan interestingly performs better than ROCm (at least for output token generation, which is the important part):

Vulkan:

llama-bench -m .config/Jan/data/models/huggingface.co/bartowski/deepcogito_cogito-v1-preview-qwen-14B-GGUF/deepcogito_cogito-v1-preview-qwen-14B-Q6_K_L.gguf
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6800 XT (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 65536 | int dot: 0 | matrix cores: none
| model          | size      | params  | backend | ngl | test  | t/s           |
| -------------- | --------- | ------- | ------- | --- | ----- | ------------- |
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | Vulkan  | 99  | pp512 | 394.77 ± 0.28 |
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | Vulkan  | 99  | tg128 | 36.82 ± 0.09  |

ROCm:

llama-bench -m .config/Jan/data/models/huggingface.co/bartowski/deepcogito_cogito-v1-preview-qwen-14B-GGUF/deepcogito_cogito-v1-preview-qwen-14B-Q6_K_L.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 6800 XT, gfx1030 (0x1030), VMM: no, Wave Size: 32
| model          | size      | params  | backend | ngl | test  | t/s           |
| -------------- | --------- | ------- | ------- | --- | ----- | ------------- |
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | ROCm    | 99  | pp512 | 633.67 ± 0.57 |
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | ROCm    | 99  | tg128 | 31.97 ± 0.02  |

llama.cpp version: 5186
OS: Linux (NixOS), kernel 6.14.3
GPU: 1x RX 6800 XT
Vulkan driver: RADV (Mesa 25.0.4)
ROCm: 6.3.3

On the other hand, I found a report where ROCm was outperforming Vulkan.

The discrepancy could be due to different GPU models/generations, a different operating system, the Vulkan implementation having improved massively since build 3818, or something else.

To find out, more benchmark comparisons like this would be needed.
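For anyone who wants to run the same side-by-side comparison, a rough sketch of the two llama.cpp builds (exact CMake flag names depend on the release: older builds used GGML_HIPBLAS instead of GGML_HIP, and gfx1030 is the target for RDNA2 cards like the RX 6800 XT):

# Vulkan backend
cmake -B build-vulkan -DGGML_VULKAN=ON
cmake --build build-vulkan --config Release

# ROCm/HIP backend (gfx1030 = RDNA2, e.g. RX 6800 XT)
cmake -B build-rocm -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030
cmake --build build-rocm --config Release

Then run the identical llama-bench command from each build directory against the same GGUF file.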

acetux avatar Apr 29 '25 17:04 acetux

Also, acceleration for AMD GPUs works perfectly fine in Jan on Linux when the linux-amd64-vulkan engine backend is selected in the settings, and if the documentation is correct, it's also supported on Windows. I propose changing the title and description of this issue, since the description is currently incorrect.

I just noticed that the names of the backends in the docs don't match the ones in Jan itself, and that the "Other Accelerators" header should be called "AMD GPU support" to match the NVIDIA one. That might be another reason why some people asked for general AMD GPU acceleration support here.

acetux avatar Apr 29 '25 17:04 acetux