Cortex support for AMD GPUs
- [X] I have searched the existing issues
## Background
Currently, Jan only supports NVIDIA GPUs for acceleration. Users with AMD GPUs, particularly those using eGPUs on Intel Macs, are unable to utilize their graphics hardware for acceleration in Jan. This limits the performance potential for a segment of Jan users who have invested in AMD GPU solutions.
## Feature request
- [ ] https://github.com/janhq/cortex/issues/323
- [ ] https://github.com/janhq/jan/issues/2587
- [ ] https://github.com/janhq/docs/issues/15
- [x] https://github.com/janhq/jan/issues/3394
- [ ] https://github.com/janhq/jan/issues/3530
- [ ] https://github.com/janhq/jan/issues/4375
## Proposed Implementation
*For engineers to fill in*
## Additional Notes
Consider prioritizing support for popular AMD GPU models like the Vega series initially.
Commenting in support of this feature request
Please support discrete AMD GPUs on MacBook Pro 2017–2019.
Not able to use Jan with amd GPUs on Linux? Sad...
You can try Vulkan, but it's still slow. A Ryzen 7 2700 (CPU) is faster than an RX 6600 over Vulkan (using a Qwen 2.5 32B Q4 model).
AMD support is a great idea.
Hopefully ROCm support can be added. For those who don't know, you can enable the admittedly slower Vulkan support:
Go into Settings -> Advanced Settings and enable "experimental mode". Then make sure `linux-amd64-vulkan` is selected under the Local Engine settings tab; if not, just select it. You should then be running with GPU acceleration, though in my testing (I used LM Studio for ROCm on Linux) the Vulkan version seems slightly slower than ROCm. So it would be nice for ROCm to actually be officially supported in Jan.
Anyway, hope this helps :) I appreciate all the hard work the contributors put into this project. Please consider prioritizing this, as a lot of people are using AMD nowadays.
So, I was interested to see if my system would even benefit from ROCm support. When benchmarking llama.cpp on Linux using an RX 6800 XT, interestingly Vulkan is performing better than ROCm (at least for the output generation, which is the important part):
Vulkan:
```
llama-bench -m .config/Jan/data/models/huggingface.co/bartowski/deepcogito_cogito-v1-preview-qwen-14B-GGUF/deepcogito_cogito-v1-preview-qwen-14B-Q6_K_L.gguf
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6800 XT (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 65536 | int dot: 0 | matrix cores: none
```
| model | size | params | backend | ngl | test | t/s |
|---|---|---|---|---|---|---|
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | Vulkan | 99 | pp512 | 394.77 ± 0.28 |
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | Vulkan | 99 | tg128 | 36.82 ± 0.09 |
ROCm:
```
llama-bench -m .config/Jan/data/models/huggingface.co/bartowski/deepcogito_cogito-v1-preview-qwen-14B-GGUF/deepcogito_cogito-v1-preview-qwen-14B-Q6_K_L.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
Device 0: AMD Radeon RX 6800 XT, gfx1030 (0x1030), VMM: no, Wave Size: 32
```
| model | size | params | backend | ngl | test | t/s |
|---|---|---|---|---|---|---|
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | ROCm | 99 | pp512 | 633.67 ± 0.57 |
| qwen2 14B Q6_K | 11.63 GiB | 14.77 B | ROCm | 99 | tg128 | 31.97 ± 0.02 |
- llama.cpp version: 5186
- OS: Linux (NixOS), Kernel 6.14.3
- GPU: 1x RX 6800 XT
- Vulkan driver: RADV (Mesa 25.0.4)
- ROCm: 6.3.3
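For anyone who wants to reproduce this comparison on their own hardware, here is a rough sketch of building llama.cpp with each backend and running the same benchmark against both. The CMake flag names follow recent llama.cpp (older releases used `GGML_HIPBLAS` instead of `GGML_HIP`), the `gfx1030` target matches the RX 6800 XT above (adjust for your GPU), and `model.gguf` is a placeholder path:

```shell
# Build llama.cpp with the Vulkan backend (requires the Vulkan SDK/headers)
cmake -B build-vulkan -DGGML_VULKAN=ON
cmake --build build-vulkan --config Release -j

# Build with the ROCm/HIP backend (requires a ROCm install; set your GPU arch)
cmake -B build-rocm -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030
cmake --build build-rocm --config Release -j

# Run the identical benchmark on each build and compare pp512 / tg128 t/s
./build-vulkan/bin/llama-bench -m model.gguf
./build-rocm/bin/llama-bench -m model.gguf
```

Running the same model file and the default pp512/tg128 tests keeps the two backends directly comparable, as in the tables above.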
On the other hand, I found a report where ROCm was outperforming Vulkan. The discrepancy could be due to different GPU models/generations, a different operating system, major improvements to the Vulkan implementation since build 3818, or something else. More benchmark comparisons like this would be needed to find out.
Also, acceleration for AMD GPUs works perfectly fine in Jan on Linux when the `linux-amd64-vulkan` engine backend is selected in the settings, and if the documentation is correct it is also supported on Windows. I propose changing the title and description of this issue, since the description is currently incorrect.
I just noticed that the backend names in the docs don't match the ones in Jan itself, and that the "Other Accelerators" header should be called "AMD GPU support" to match the NVIDIA one. That might be another reason why some people asked for general AMD GPU acceleration support here.