composable_kernel
[CK_TILE] Merge multiple fwd convolution groups into a single GEMM batch.
Proposed changes
This PR adds merging of multiple forward convolution groups into a single GEMM batch. Most of the required components were already available; the only major code changes are to the group offset calculations in the CK Tile grouped forward convolution kernel.
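To illustrate the kind of offset arithmetic involved, here is a minimal sketch. It is not the CK Tile implementation. Every name in it (`GroupStrides`, `GroupOffsets`, `num_gemm_batches`, `batch_offsets`, `groups_per_batch`) is hypothetical, and it assumes that the group count is divisible by the number of merged groups and that groups are laid out with a fixed element stride in each tensor.

```cpp
#include <cstdint>

// Hypothetical helper types for illustration only; not part of the CK Tile API.
// Each stride is the number of elements between two consecutive groups.
struct GroupStrides
{
    std::int64_t input;
    std::int64_t weight;
    std::int64_t output;
};

// Element offsets of the first group handled by one GEMM batch.
struct GroupOffsets
{
    std::int64_t input;
    std::int64_t weight;
    std::int64_t output;
};

// Number of GEMM batches when `groups_per_batch` convolution groups are merged
// into each batch. Assumes num_groups is divisible by groups_per_batch.
inline std::int64_t num_gemm_batches(std::int64_t num_groups, std::int64_t groups_per_batch)
{
    return num_groups / groups_per_batch;
}

// A batch index no longer maps to a single group: batch `batch_idx` starts at
// group `batch_idx * groups_per_batch`, so each per-group stride is scaled by
// that first group index.
inline GroupOffsets batch_offsets(std::int64_t batch_idx,
                                  std::int64_t groups_per_batch,
                                  const GroupStrides& strides)
{
    const std::int64_t first_group = batch_idx * groups_per_batch;
    return GroupOffsets{first_group * strides.input,
                        first_group * strides.weight,
                        first_group * strides.output};
}
```

With 8 groups merged 2 at a time, this sketch yields 4 GEMM batches, and batch 3 starts at group 6. The real kernel may compute and apply these offsets differently.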