Alex Eremin
> > Also, could you remove the GPU-specific parts from the CPU implementation (more details are in this comment: [#3143 (comment)](https://github.com/ROCm/MIOpen/pull/3143#discussion_r1711335230))? And may I ask you to align the test to...
> > Such a huge (really huge?) error means that the kernel doesn't perform reduction in an acceptable way.
> >
> > It’s not, technically speaking, unacceptable; it’s just a side effect of...
> the same Q about host-side overhead.
> * we know that HIP compilation time is typically 10x longer than OCL. We must ensure that OCL->HIP transition does not lead...
@junliume we are waiting for the perf metrics
It can be safely merged since it does not affect production code. I just don't want to lose this PR.
Could you check all the comments from https://github.com/ROCm/MIOpen/pull/3143, https://github.com/ROCm/MIOpen/pull/3156 and https://github.com/ROCm/MIOpen/pull/3166 and implement them here too? Moreover, I suspect that all these 4 PRs are very similar and actually it...