
Support for AMD's ROCm

Open riverzhou opened this issue 1 year ago • 5 comments

The official llama.cpp already supports ROCm; when will qwen.cpp support it?

riverzhou avatar Nov 23 '23 07:11 riverzhou

https://github.com/YellowRoseCx/koboldcpp-rocm this project can use hipBLAS on windows for GGML and GGUF models

CellerX avatar Nov 25 '23 02:11 CellerX

> https://github.com/YellowRoseCx/koboldcpp-rocm this project can use hipBLAS on windows for GGML and GGUF models

Thanks!

riverzhou avatar Nov 30 '23 10:11 riverzhou


I modified the ggml framework to support ROCm, and added ROCm support to qwen.cpp:

https://github.com/riverzhou/qwen.cpp

Tests pass on my 7800 XT, and speed is around 37 tokens/second on the 14B Q5_1 model.

riverzhou avatar Dec 01 '23 08:12 riverzhou

I submitted a pull request to upstream ggml, and it was merged just now. For now, just add

if (GGML_HIPBLAS)
  add_compile_definitions(GGML_USE_HIPBLAS GGML_USE_CUBLAS)
  set_property(TARGET ggml PROPERTY AMDGPU_TARGETS ${AMDGPU_TARGETS})
endif()

to Qwen's CMakeLists.txt and update ggml; it will then support AMD's ROCm.
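For reference, a ROCm build with the snippet above enabled might look like the following sketch. The ROCm install prefix `/opt/rocm` and the `gfx1101` target (RDNA 3, e.g. the 7800 XT) are assumptions; adjust them for your system:

```shell
# Configure with hipBLAS enabled, using ROCm's clang toolchain
# (GGML_HIPBLAS and AMDGPU_TARGETS match the CMake snippet above).
cmake -B build \
      -DGGML_HIPBLAS=ON \
      -DAMDGPU_TARGETS=gfx1101 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++

# Build
cmake --build build -j
```

You can check your GPU's target name with `rocminfo` (look for the `gfx…` string under your GPU agent).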

riverzhou avatar Dec 01 '23 09:12 riverzhou

Can it support AMD ROCm on Windows?

louwangzhiyuY avatar Dec 04 '23 07:12 louwangzhiyuY