cutlass [QST] Support fp8 gemm with 128x1 LHS scaling and 1x128 RHS scaling

In example 67_hopper_fp8_warp_specialized_gemm_with_groupwise_scaling, suppose the params are changed like this:

constexpr int ScaleGranularityM = 128; 
constexpr int ScaleGranularityN = 128;
constexpr int ScaleGranularityK = 1;

Does it support fp8 gemm with 128x1 LHS scaling and 1x128 RHS scaling?
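
To make the question concrete, here is a minimal host-side sketch of the math such a scaling scheme implies. It is not taken from CUTLASS or from example 67: the function name `scaled_gemm_reference`, the row-major layouts, the scale-tensor shapes (`sfa`, `sfb`), and the use of `float` in place of fp8 are all assumptions made for illustration. With `gM = 128`, `gN = 128`, `gK = 1`, each scale in `sfa` covers a 128x1 block of the LHS and each scale in `sfb` covers a 1x128 block of the RHS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference (not a CUTLASS API): D = (A .* SFA) * (B .* SFB).
// A:   M x K, row-major.            B:   K x N, row-major.
// sfa: (M / gM) x (K / gK), one scale per gM x gK block of A (assumed layout).
// sfb: (K / gK) x (N / gN), one scale per gK x gN block of B (assumed layout).
// float stands in for fp8 so the sketch runs on any host compiler.
std::vector<float> scaled_gemm_reference(
    const std::vector<float>& A, const std::vector<float>& B,
    const std::vector<float>& sfa, const std::vector<float>& sfb,
    std::size_t M, std::size_t N, std::size_t K,
    std::size_t gM, std::size_t gN, std::size_t gK) {
  // Each dimension must be a whole number of scaling groups.
  assert(M % gM == 0 && N % gN == 0 && K % gK == 0);
  const std::size_t kBlocks = K / gK;
  const std::size_t nBlocks = N / gN;
  assert(A.size() == M * K && B.size() == K * N);
  assert(sfa.size() == (M / gM) * kBlocks);
  assert(sfb.size() == kBlocks * nBlocks);

  std::vector<float> D(M * N, 0.0f);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k) {
        // Dequantize each operand with the scale of the block it falls in.
        const float a = A[m * K + k] * sfa[(m / gM) * kBlocks + k / gK];
        const float b = B[k * N + n] * sfb[(k / gK) * nBlocks + n / gN];
        acc += a * b;
      }
      D[m * N + n] = acc;
    }
  }
  return D;
}
```

Calling it with `gM = 128, gN = 128, gK = 1` corresponds to the parameters above. Because the scales change at every `k` in that case, they cannot be factored out of the K loop, which is the part a real kernel has to handle in its mainloop.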

May 05 '25 05:05 hxdtest

The background is discussed at https://github.com/deepseek-ai/DeepGEMM/issues/10

May 05 '25 05:05 hxdtest

Does it support fp8 gemm with 128x1 LHS scaling and 1x128 RHS scaling?

yes

May 06 '25 18:05 hwu36
