qwen.cpp
Support for AMD's ROCm
Official llama.cpp already supports ROCm; when will qwen.cpp support ROCm?
https://github.com/YellowRoseCx/koboldcpp-rocm
This project can use hipBLAS on Windows for GGML and GGUF models.
Thanks!
I modified the ggml framework to support ROCm and added ROCm support to qwen.cpp:
https://github.com/riverzhou/qwen.cpp
Tests passed on my 7800 XT, and the speed is around 37 tokens/second on the 14B Q5_1 model.
I submitted a pull request to upstream ggml, and it was merged just now. For now, just add
if (GGML_HIPBLAS)
    add_compile_definitions(GGML_USE_HIPBLAS GGML_USE_CUBLAS)
    set_property(TARGET ggml PROPERTY AMDGPU_TARGETS ${AMDGPU_TARGETS})
endif()
to qwen.cpp's CMakeLists.txt and update ggml; it will then support AMD's ROCm.
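For reference, a minimal configure-and-build sketch might look like the lines below. GGML_HIPBLAS and AMDGPU_TARGETS come from the CMake snippet above; the gfx1101 target (assumed for the 7800 XT) and the ROCm clang paths under /opt/rocm are assumptions about a typical Linux ROCm install and may differ on your system.

# Assumed: ROCm installed under /opt/rocm; verify your GPU's
# target (e.g. gfx1101 for a 7800 XT) with the rocminfo tool.
cmake -B build -DGGML_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1101 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build -j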
Can it support AMD ROCm on Windows?