Fix the behavior of extent(r) for r > rank in TeamGEMM

Open yasahi-hpc opened this issue 8 months ago • 10 comments

May partially resolve #2622. After looking at the behavior of older versions, I found that extent(r) for r > rank should return 1.

This PR fixes the behavior.
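For readers unfamiliar with the convention, here is a minimal stand-alone sketch of the intended semantics. It is not the Kokkos Kernels code: `extent_or_one` is a hypothetical helper over a plain `std::array`, and it assumes that "beyond the rank" means any index `r` that is not a valid dimension (that is, `r >= Rank`).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only; not part of Kokkos or Kokkos Kernels.
// Returns the stored extent for a valid dimension (r < Rank) and 1 otherwise,
// so a lower-rank object can be treated as if it had trailing dimensions of size 1.
template <std::size_t Rank>
constexpr std::size_t extent_or_one(const std::array<std::size_t, Rank>& extents,
                                    std::size_t r) {
  // Parentheses make the grouping of the condition explicit.
  return (r < Rank) ? extents[r] : 1;
}
```

With this convention, code that always queries two dimensions still sees a 1 x 1 shape for a rank-0 object and an n x 1 shape for a rank-1 object.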

yasahi-hpc avatar Jun 04 '25 15:06 yasahi-hpc

I don't know the precedence of || and the ternary operator offhand; could you please insert parens (even if not technically necessary)?

Sure
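As a side note on the precedence question: in C++, `||` binds more tightly than the conditional operator, so `a || b ? x : y` is parsed as `(a || b) ? x : y`. The sketch below uses hypothetical function and parameter names (the actual expression in the PR is not shown in this thread) to illustrate that the parentheses do not change the result but do rule out the other reading.

```cpp
#include <cassert>

// Hypothetical names for illustration; not taken from the PR.

// Without parentheses: the compiler groups this as (a || b) ? x : y.
constexpr int pick_unparenthesized(bool a, bool b, int x, int y) {
  return a || b ? x : y;
}

// With explicit parentheses: same meaning, easier to read.
constexpr int pick_parenthesized(bool a, bool b, int x, int y) {
  return (a || b) ? x : y;
}

// The other grouping a reader might fear: a || (b ? x : y).
// It yields a bool and is a different expression.
constexpr bool pick_other_grouping(bool a, bool b, int x, int y) {
  return a || ((b ? x : y) != 0);
}

// Compile-time check that the first two forms agree for a sample input.
static_assert(pick_unparenthesized(true, false, 10, 20) ==
                  pick_parenthesized(true, false, 10, 20),
              "|| binds tighter than ?:");
```

Adding the parentheses is therefore a readability change only, which matches the reviewer's "even if not technically necessary".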

yasahi-hpc avatar Jun 04 '25 17:06 yasahi-hpc

Can we add a small unit test that exhibits the current issue and shows that the proposed PR fixes it?

If this is related, it means that there is a GEMM operation on a rank-0 view. I can add a test for that in https://github.com/kokkos/kokkos-kernels/pull/2628

yasahi-hpc avatar Jun 04 '25 17:06 yasahi-hpc

I'm building the Ifpack2 tests to check the impact of this PR. Thanks for tracking this down, @yasahi-hpc

ndellingwood avatar Jun 04 '25 17:06 ndellingwood

Unfortunately I still see the failures in Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4 when building Trilinos with this PR, tested in a Cuda build

ndellingwood avatar Jun 04 '25 18:06 ndellingwood

Unfortunately I still see the failures in Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4 when building Trilinos with this PR, tested in a Cuda build

Thanks a lot for testing and reporting. That is a bit unfortunate. I will continue investigating.

yasahi-hpc avatar Jun 05 '25 08:06 yasahi-hpc

Hi @yasahi-hpc , I retested with 9f1b00a904b44828cefdcd54f9f7c46908c7b27a in a Cuda build but I am still seeing failures with the Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4 test

ndellingwood avatar Jun 13 '25 17:06 ndellingwood

Hi @yasahi-hpc , I retested with 9f1b00a in a Cuda build but I am still seeing failures with the Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4 test

Thank you for testing. That is unfortunate.

I will investigate on my side. First of all, I need to build Trilinos in my environment.

yasahi-hpc avatar Jun 16 '25 07:06 yasahi-hpc

@lucbv Can I close this and start working again on #2628 and #2651?

yasahi-hpc avatar Oct 30 '25 19:10 yasahi-hpc

Yes, I think we can. Sorry we took a while to make the final decision on how to move forward. The code changes in GEMM led to issues with CUDA and HIP, so we might want to re-introduce things more slowly, with more, smaller PRs...

lucbv avatar Oct 30 '25 19:10 lucbv

Thank you for the information. So, it is better not to work on #2628 and #2651.

yasahi-hpc avatar Oct 30 '25 19:10 yasahi-hpc