
[CK_TILE] Merge multiple fwd convolution groups into a single GEMM batch.

Open vpietila-amd opened this issue 1 month ago • 0 comments

Proposed changes

This change merges multiple forward convolution groups into a single GEMM batch. Most of the required components were already available; the only major code changes are to the group offset calculations in the CK Tile grouped forward convolution kernel.
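For readers unfamiliar with the idea, here is a minimal sketch of how per-batch offsets might be computed once several groups share one GEMM batch. It is not taken from the CK Tile source: `GroupStrides`, `BatchOffsets`, `num_gemm_batches`, `merged_batch_offsets`, and `groups_per_batch` are hypothetical names, and the per-group strides assume NHWGC input, GKYXC weight, and NHWGK output layouts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: distance in elements between consecutive groups in each tensor.
// Under the assumed layouts: input = C, weight = K*Y*X*C, output = K.
struct GroupStrides
{
    std::int64_t input;
    std::int64_t weight;
    std::int64_t output;
};

// Hypothetical: element offsets of the first group handled by one GEMM batch.
struct BatchOffsets
{
    std::int64_t input;
    std::int64_t weight;
    std::int64_t output;
};

// Number of GEMM batches when groups_per_batch conv groups are merged into one.
// Assumes the group count is an exact multiple of groups_per_batch.
inline std::int64_t num_gemm_batches(std::int64_t num_groups, std::int64_t groups_per_batch)
{
    assert(groups_per_batch > 0 && num_groups % groups_per_batch == 0);
    return num_groups / groups_per_batch;
}

// Without merging, batch b starts at group b. With merging, batch b starts at
// group b * groups_per_batch, so every offset is scaled by that factor.
inline BatchOffsets merged_batch_offsets(std::int64_t batch_idx,
                                         std::int64_t groups_per_batch,
                                         const GroupStrides& strides)
{
    const std::int64_t first_group = batch_idx * groups_per_batch;
    return BatchOffsets{first_group * strides.input,
                        first_group * strides.weight,
                        first_group * strides.output};
}
```

With `groups_per_batch` set to 1 this reduces to the usual one-group-per-batch indexing, which is consistent with the description that only the offset calculations needed to change.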

vpietila-amd, Oct 31 '25 12:10