composable_kernel
Backward weight convolution kernel error when profiling is enabled
We found that the backward weight (wrw) convolution kernels produce errors when profiling is enabled for the CK invoker run() functions, which causes the CK-based solver to fail in MIOpen.
We observed the following: when CK profiling is enabled, the CK wrw kernel introduces errors that cause a precision issue. When CK profiling is disabled, the result is correct, but the reported time is 0.
This leads to a situation where we can get either correct profiling information or a correct result from CK, but not both.
I have proposed a workaround, PR2770 in MIOpen, to address this issue. It would be nice if CK could provide a fix that lets us get both the profiling information and the correct result at the same time.
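For illustration only, one way to get both a timing and a correct result is to leave CK profiling disabled and time a single launch from the caller's side. The sketch below is not taken from PR2770 and is not a CK or MIOpen API: `TimeSingleRun` is a hypothetical helper, and it assumes the callable passed in returns only after the kernel has finished (on a GPU this would mean synchronizing the device inside the callable).

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (an assumption for illustration, not part of CK or MIOpen).
// Runs `launch` exactly once and returns the elapsed wall-clock time in milliseconds.
// Assumption: `launch` blocks until the work it starts has completed.
template <typename Launch>
float TimeSingleRun(Launch&& launch)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Launch>(launch)(); // single execution, so the output is written only once
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<float, std::milli>(stop - start).count();
}
```

In use, the callable would wrap the CK invoker call with profiling turned off, followed by a device synchronization, so that the measured interval covers the whole kernel. Wall-clock timing like this includes launch overhead, so the numbers are coarser than what a device-side profiler would report.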
@iq136boy Is this fixed in the latest ROCm 6.2? If so, please close the ticket. Thanks!
Closing the ticket, as @iq136boy will use their own profiling tool.