[QST] Support fp8 gemm with 128x1 LHS scaling and 1x128 RHS scaling
In example 67_hopper_fp8_warp_specialized_gemm_with_groupwise_scaling, if the parameters are changed like this:
constexpr int ScaleGranularityM = 128;
constexpr int ScaleGranularityN = 128;
constexpr int ScaleGranularityK = 1;
does it support fp8 gemm with 128x1 LHS scaling and 1x128 RHS scaling?
The background is discussed at https://github.com/deepseek-ai/DeepGEMM/issues/10
does it support fp8 gemm with 128x1 LHS scaling and 1x128 RHS scaling?
yes
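For anyone checking results against a reference, here is a minimal CPU sketch of what groupwise scaling computes numerically. It is not the CUTLASS implementation or API: the function name `groupwise_scaled_gemm`, the row-major layouts, and the granularity parameters `gm`, `gn`, `gk` are all assumptions made for illustration, and how "128x1" and "1x128" map onto them depends on the layout convention you use. Each partial sum over one K group is multiplied by the LHS scale for that row group and the RHS scale for that column group before being accumulated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not a CUTLASS API.
// A is M x K (row-major), B is K x N (row-major), D is M x N (row-major).
// scale_a holds (M / gm) x (K / gk) values, one per group of gm rows and gk K-elements.
// scale_b holds (N / gn) x (K / gk) values, one per group of gn columns and gk K-elements.
std::vector<float> groupwise_scaled_gemm(
    const std::vector<float>& a, const std::vector<float>& b,
    const std::vector<float>& scale_a, const std::vector<float>& scale_b,
    std::size_t M, std::size_t N, std::size_t K,
    std::size_t gm, std::size_t gn, std::size_t gk) {
  assert(gm > 0 && gn > 0 && gk > 0);
  assert(M % gm == 0 && N % gn == 0 && K % gk == 0);
  const std::size_t k_groups = K / gk;
  assert(a.size() == M * K && b.size() == K * N);
  assert(scale_a.size() == (M / gm) * k_groups);
  assert(scale_b.size() == (N / gn) * k_groups);

  std::vector<float> d(M * N, 0.0f);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (std::size_t kg = 0; kg < k_groups; ++kg) {
        // Unscaled partial dot product over one K group.
        float partial = 0.0f;
        for (std::size_t k = kg * gk; k < (kg + 1) * gk; ++k) {
          partial += a[m * K + k] * b[k * N + n];
        }
        // Apply both scales once per K group, then accumulate.
        const float sa = scale_a[(m / gm) * k_groups + kg];
        const float sb = scale_b[(n / gn) * k_groups + kg];
        acc += partial * sa * sb;
      }
      d[m * N + n] = acc;
    }
  }
  return d;
}
```

With `gm = 1`, `gn = 128`, `gk = 128`, this sketch gives one LHS scale per row per 128 K-elements and one RHS scale per 128x128 block. Other combinations are just different values of the three granularities.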