qwen.cpp
Support for AMD's ROCm
Official llama.cpp already supports ROCm; when will qwen.cpp support ROCm?
https://github.com/YellowRoseCx/koboldcpp-rocm
This project can use hipBLAS on Windows for GGML and GGUF models.
Thanks!
I modified the ggml framework to support ROCm and added ROCm support to qwen.cpp:
https://github.com/riverzhou/qwen.cpp
Tests passed on my 7800 XT, and the speed is around 37 tokens/second on the 14B Q5_1 model.
I submitted a pull request to upstream ggml, and it was merged just now. For now, just add
if (GGML_HIPBLAS)
    add_compile_definitions(GGML_USE_HIPBLAS GGML_USE_CUBLAS)
    set_property(TARGET ggml PROPERTY AMDGPU_TARGETS ${AMDGPU_TARGETS})
endif()
to qwen.cpp's CMakeLists.txt and update ggml; it will then support AMD's ROCm.
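For reference, a minimal configure-and-build sketch might look like the lines below. GGML_HIPBLAS and AMDGPU_TARGETS come from the CMake snippet above; the gfx1101 target (assumed for the 7800 XT) and the ROCm clang paths under /opt/rocm are assumptions about a typical Linux ROCm install and may differ on your system.

# Assumed: ROCm installed under /opt/rocm; verify your GPU's
# target (e.g. gfx1101 for a 7800 XT) with the rocminfo tool.
cmake -B build -DGGML_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1101 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build -j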
Can it support AMD ROCm on Windows?