
[CK Tile] Need GroupGEMM with make_argument on device

Open · zjing14 opened this issue 10 months ago · 1 comment

Currently, CK Tile GroupGEMM prepares its metadata on the host, which requires transferring that metadata back and forth between the host and the device: https://github.com/ROCm/composable_kernel/blob/6b6fcd370bb2e5572422a1ca71d261df02b6263e/include/ck_tile/ops/gemm/kernel/grouped_gemm_kernel.hpp#L98

To avoid this overhead, we need a GroupGEMM kernel like the one in old CK: a persistent kernel that reads the GEMM shapes from device memory and calculates the offsets and block_id on the fly.
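To make the request concrete, here is a minimal host-side C++ sketch of the mapping such a persistent kernel would compute on the fly. It is an illustration under stated assumptions, not the old CK or CK Tile implementation: `GemmShape`, `TileCoord`, `map_tile`, and `tiles_for_block` are hypothetical names, the shapes array stands in for a buffer in device memory, and the linear scan over groups is just one simple way to locate the owning group.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-group problem description. In a real kernel this array
// would be read from device memory instead of being prepared on the host.
struct GemmShape { int32_t M, N, K; };

// Hypothetical result: which group a tile belongs to, and its tile indices.
struct TileCoord { int32_t group, tile_m, tile_n; };

constexpr int32_t ceil_div(int32_t a, int32_t b) { return (a + b - 1) / b; }

// Map a flat tile id to (group, tile_m, tile_n) by walking the shapes and
// accumulating each group's tile count. Returns group == -1 when tile_id is
// past the last tile of the last group.
inline TileCoord map_tile(const std::vector<GemmShape>& shapes,
                          int32_t tile_id, int32_t tile_M, int32_t tile_N)
{
    assert(tile_M > 0 && tile_N > 0);
    int32_t first = 0; // flat id of the current group's first tile
    for (int32_t g = 0; g < static_cast<int32_t>(shapes.size()); ++g) {
        const int32_t tiles_n = ceil_div(shapes[g].N, tile_N);
        const int32_t count   = ceil_div(shapes[g].M, tile_M) * tiles_n;
        if (tile_id < first + count) {
            const int32_t local = tile_id - first; // offset inside group g
            return {g, local / tiles_n, local % tiles_n};
        }
        first += count;
    }
    return {-1, -1, -1};
}

// Persistent loop of one workgroup: start at block_id and stride by the
// grid size until the tiles run out. A device kernel would run a GEMM tile
// at each step instead of recording the coordinate.
inline std::vector<TileCoord> tiles_for_block(const std::vector<GemmShape>& shapes,
                                              int32_t block_id, int32_t grid_size,
                                              int32_t tile_M, int32_t tile_N)
{
    std::vector<TileCoord> out;
    for (int32_t t = block_id;; t += grid_size) {
        const TileCoord c = map_tile(shapes, t, tile_M, tile_N);
        if (c.group < 0) break;
        out.push_back(c);
    }
    return out;
}
```

With two groups of shapes 256x128x64 and 100x300x32 and 128x128 tiles, the groups contribute 2 and 3 tiles respectively, so a grid of 2 persistent workgroups would process 3 and 2 tiles each. Because the mapping depends only on the shapes array and the tile sizes, nothing has to be precomputed on the host.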

zjing14, Feb 14 '25 20:02

@aosewski Could you help by moving your GroupGEMM design from old CK to CK Tile?

zjing14, Feb 14 '25 20:02