[QST] Why do we have three GEMMs in CUTLASS?

Open • ziyuhuang123 opened this issue 6 months ago • 1 comment

What is your question?

https://github.com/NVIDIA/cutlass/blob/f7b19de32c5d1f3cedfc735c2849f12b537522ee/include/cutlass/gemm/collective/sm90_mma_tma_gmma_ss_warpspecialized.hpp#L477-L554

I understand that parts 2 and 3 correspond to k_iter = 0 and k_iter in [1, k_end), respectively. But what is the purpose of part 1, and why does it iterate over k_block? (In my tests, part 1 is indeed entered several times, and if I comment it out, the result is incorrect.)
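For readers following along, the general loop-splitting pattern behind this kind of question can be illustrated with a toy sketch. This is not the CUTLASS mainloop, and it does not claim to map onto parts 1 to 3 of the linked code; `peeled_dot`, `reference_dot`, `k_tiles`, `k_blocks` and `skip_prologue` are hypothetical names made up for the example. It only shows that when a reduction over K is written as several separate loops, each loop covers its own slice of K, so skipping one of them silently drops contributions from the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy reduction, NOT the CUTLASS mainloop.
// K is split into k_tiles tiles of k_blocks elements each.
// skip_prologue models "commenting out" the first loop.
int peeled_dot(const std::vector<int>& a, const std::vector<int>& b,
               int k_tiles, int k_blocks, bool skip_prologue = false) {
    assert(a.size() == b.size());
    assert(a.size() == static_cast<std::size_t>(k_tiles) * k_blocks);

    int accum = 0;
    bool overwrite = true;  // the very first product overwrites the accumulator

    // Peeled first loop: walks the k_blocks of tile 0 only.
    if (!skip_prologue) {
        for (int k_block = 0; k_block < k_blocks; ++k_block) {
            int product = a[k_block] * b[k_block];
            accum = overwrite ? product : accum + product;
            overwrite = false;  // every later product accumulates
        }
    }

    // Remaining loop: tiles [1, k_tiles), each with its own k_block loop.
    for (int k_tile = 1; k_tile < k_tiles; ++k_tile) {
        for (int k_block = 0; k_block < k_blocks; ++k_block) {
            int idx = k_tile * k_blocks + k_block;
            accum += a[idx] * b[idx];
        }
    }
    return accum;
}

// Plain single-loop reference to compare against.
int reference_dot(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    int accum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        accum += a[i] * b[i];
    }
    return accum;
}
```

With `skip_prologue` left at `false`, `peeled_dot` agrees with `reference_dot`; with `skip_prologue = true`, the contributions of tile 0 are missing, which matches the kind of wrong result described above when one part is commented out.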

Image

ziyuhuang123 • Aug 28 '24 11:08