Fix ROCm build error
Fix for https://github.com/kvcache-ai/ktransformers/issues/1178
Looks good, with one note: at runtime I hit an exception: undefined symbol: _Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_lS0_lllib
--- a/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
+++ b/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu
@@ -74,8 +74,8 @@ namespace gptq_marlin {
 torch::Tensor gptq_marlin_gemm(torch::Tensor& a, torch::Tensor& b_q_weight,
                                torch::Tensor& b_scales, torch::Tensor& g_idx,
                                torch::Tensor& perm, torch::Tensor& workspace,
-                               int64_t num_bits, int64_t size_m, int64_t size_n,
-                               int64_t size_k, bool is_k_full) {
+                               int64_t num_bits, torch::Tensor size_m_tensor, int64_t size_m, int64_t size_n,
+                               int64_t size_k, int sms, bool is_k_full) {
   TORCH_CHECK_NOT_IMPLEMENTED(false,
                               "marlin_gemm(..) requires CUDA_ARCH >= 8.0");
   return torch::empty({ 1, 1 });
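For context, the mangled name in the error can be decoded with c++filt (from binutils) to see the signature the loader expects, which matches the parameter list the diff above adds to the CUDA_ARCH < 8.0 stub:

```shell
# Demangle the missing symbol to reveal the expected C++ signature.
echo '_Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_lS0_lllib' | c++filt
# → gptq_marlin_gemm(at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&,
#                    at::Tensor&, at::Tensor&, long, at::Tensor, long, long,
#                    long, int, bool)
```

The extra `at::Tensor` after the first `long` and the `int` before the final `bool` are exactly the `size_m_tensor` and `sms` parameters the stub was missing, which is why the unchanged stub produced an undefined-symbol error at load time.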
I sincerely apologize. This patch only addresses the compilation issue. Since vLLMMarlin wasn't supported, I had commented it out in my local environment. Alternatively, we could move the import vLLMMarlin statement to the actual usage location. I'll update the patch shortly.
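Moving the import to the usage location can be sketched generically in Python. This is an illustrative pattern only, not the actual ktransformers code, and the function and module names are placeholders:

```python
import importlib


def optional_import(module_name):
    """Import a module lazily at the call site.

    Returns the module, or None if it is unavailable, so that merely
    importing the caller's module never fails on builds (e.g. ROCm)
    where the extension cannot be loaded.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def run_marlin_gemm(*args, **kwargs):
    # Hypothetical usage site: import vLLMMarlin only when it is called.
    vllm_marlin = optional_import("vLLMMarlin")
    if vllm_marlin is None:
        raise RuntimeError("vLLMMarlin is not supported on this build")
    return vllm_marlin.gptq_marlin_gemm(*args, **kwargs)
```

With this shape, the top-level import of the wrapper module always succeeds, and only code paths that actually call the Marlin kernel see the error on unsupported builds.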
@kevin-t-tang Just updated – could you please review? Thanks!