composable_kernel
Disable K-pad checking during main loop
GridwiseGemm implementations guarantee that there are no out-of-range (padded) addresses along K0/K1 in the main loop, but they do not yet take advantage of that guarantee: the K-pad validity check still runs there.
@asroy once suggested adding a hook (or a similar mechanism) in the A/B tensor descriptor to disable K-pad checking during the main loop, but did not rule out that there could be better ways of achieving the same effect.
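
To make the suggestion concrete, here is a minimal sketch of what a compile-time "hook" could look like. It is not composable_kernel code: `PaddedKDescriptor`, `IsValidK`, `LoadK`, and `ReduceK` are hypothetical names, the tensor is reduced to a single K dimension, and the split into an unchecked main loop plus a checked tail is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an A/B tensor descriptor (not the composable_kernel API).
struct PaddedKDescriptor
{
    std::size_t k_raw;    // number of valid elements along K
    std::size_t k_padded; // K extent after padding to a multiple of the block size

    // The "hook": with CheckKPad == false the range test is a compile-time `true`,
    // so the compiler can drop the comparison entirely.
    template <bool CheckKPad>
    bool IsValidK(std::size_t k) const
    {
        return !CheckKPad || k < k_raw;
    }
};

// Load one element along K; padded positions read as zero.
template <bool CheckKPad>
float LoadK(const PaddedKDescriptor& desc, const std::vector<float>& buf, std::size_t k)
{
    return desc.IsValidK<CheckKPad>(k) ? buf[k] : 0.0f;
}

// Sum along K. Assumption: only the last block can touch padded addresses,
// so the main loop over full blocks skips the check and the tail keeps it.
float ReduceK(const PaddedKDescriptor& desc, const std::vector<float>& buf, std::size_t k_per_block)
{
    assert(buf.size() >= desc.k_raw);
    assert(desc.k_padded % k_per_block == 0);

    const std::size_t k_main = (desc.k_raw / k_per_block) * k_per_block;
    float acc = 0.0f;

    for(std::size_t k = 0; k < k_main; ++k) // main loop: no K-pad check
        acc += LoadK<false>(desc, buf, k);

    for(std::size_t k = k_main; k < desc.k_padded; ++k) // tail: K-pad check enabled
        acc += LoadK<true>(desc, buf, k);

    return acc;
}
```

With `k_raw = 10`, `k_padded = 12`, and `k_per_block = 4`, the main loop covers indices 0 to 7 without any check and the tail covers 8 to 11 with the check, so the two padded positions contribute zero. Whether a template flag on the descriptor is the right place for this switch, as opposed to some other mechanism, is exactly the open question raised above.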