kokkos-kernels

KokkosBlas::gemm operates incorrectly in corner case

Open kddevin opened this issue 4 years ago • 5 comments

https://github.com/trilinos/Trilinos/issues/9583 is seeing incorrect results from MultiVector multiply due to an error in KokkosBlas::gemm.

KokkosBlas::gemm exits early when the input A has no entries. See https://github.com/kokkos/kokkos-kernels/blob/00189c0be23a70979aeaa162f0abd4c0e4d1c479/src/blas/KokkosBlas3_gemm.hpp#L142

But it exits before multiplying C by beta. If C is not empty, gemm should multiply C by beta before the early exit.

This use case arises when A has zero entries on some processor, and C is locally replicated on all processors. On the empty processor, C's values do not get multiplied by beta as they should.
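To make the expected behavior concrete, here is a minimal sketch of the GEMM semantics, C = beta*C + alpha*A*B, written as a plain row-major loop. `reference_gemm` is a hypothetical helper for illustration only; it is not the KokkosBlas::gemm implementation and does not use Kokkos Views. When the shared dimension k is zero (A has no entries), the inner sum is empty, but C is still scaled by beta. As in reference BLAS, beta == 0 overwrites C rather than multiplying it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Kokkos Kernels: a plain row-major
// reference GEMM computing C = beta*C + alpha*A*B,
// where A is m x k, B is k x n and C is m x n.
void reference_gemm(std::size_t m, std::size_t n, std::size_t k,
                    double alpha, const std::vector<double>& A,
                    const std::vector<double>& B,
                    double beta, std::vector<double>& C) {
  assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      double sum = 0.0;  // stays 0 when k == 0 (A and B have no entries)
      for (std::size_t p = 0; p < k; ++p) {
        sum += A[i * k + p] * B[p * n + j];
      }
      // The beta scaling is applied regardless of k.
      // Following reference BLAS, beta == 0 overwrites C instead of scaling it.
      const double scaled = (beta == 0.0) ? 0.0 : beta * C[i * n + j];
      C[i * n + j] = scaled + alpha * sum;
    }
  }
}
```

Calling this with m = n = 2, k = 0, empty A and B, and beta = 0.5 halves every entry of C, which is the result the empty processor should produce.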

kddevin avatar Aug 26 '21 22:08 kddevin

@ndellingwood @brian-kelley Looks like Karen already suggested a fix (commenting out the early exit). Can you make sure this is the final fix and push it to the repo here (and to Trilinos) if it has not been pushed yet?

srajama1 avatar Aug 31 '21 15:08 srajama1

I proposed a fix that keeps some quick returns based on input dimensions, but it should handle the corner case of empty A/B with a non-empty C matrix better.
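As a rough illustration of what dimension-based quick returns could look like, here is a sketch under stated assumptions. `gemm_quick_return` is a hypothetical function, not the code from the proposed fix, and it works on a plain `std::vector` rather than Kokkos Views. The idea is to return without any work only when C itself is empty, and to scale C by beta before returning when A*B contributes nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the code from the actual fix.
// Returns true when the call was fully handled without a matrix product.
// C is m x n (row-major); k is the shared dimension of A and B.
bool gemm_quick_return(std::size_t m, std::size_t n, std::size_t k,
                       double alpha, double beta, std::vector<double>& C) {
  assert(C.size() == m * n);
  // C is empty: there is nothing to scale or update.
  if (m == 0 || n == 0) return true;
  // A*B contributes nothing (empty A/B or alpha == 0),
  // but C must still be scaled by beta before returning.
  if (k == 0 || alpha == 0.0) {
    for (double& c : C) c = (beta == 0.0) ? 0.0 : beta * c;
    return true;
  }
  return false;  // caller must run the full GEMM
}
```

The difference from an unconditional early exit is that the empty-A case falls into the second branch, so a locally replicated C gets the same beta scaling on every processor.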

lucbv avatar Aug 31 '21 17:08 lucbv

This was fixed by PR #1091; it now needs to be ported to Trilinos.

lucbv avatar Sep 09 '21 23:09 lucbv

@lucbv What is the target date for porting the fix to Trilinos? Once it is ported, we can merge my reproducer https://github.com/trilinos/Trilinos/pull/9819 . Thanks.

kddevin avatar Oct 15 '21 19:10 kddevin

I was waiting a bit to see if the Kokkos/Kokkos Kernels release would happen quickly and would automatically take care of this. Let me check tomorrow and if it's not likely to happen this week I will create a Trilinos PR to fix the issue.

lucbv avatar Oct 17 '21 19:10 lucbv