Kokkos::SerialSVD hang

Open vasylivy opened this issue 11 months ago • 2 comments

Hi,

I'm reporting a bug with Kokkos 4.5. The following test hangs:

#include <gtest/gtest.h>
#include <Kokkos_Core.hpp>
#include <KokkosBatched_SVD_Decl.hpp>

TEST(KokkosSerialSVD, does_not_solve)
{
  Kokkos::View<double[3][2], Kokkos::HostSpace> A(Kokkos::ViewAllocateWithoutInitializing("A"));
  Kokkos::View<double[3][3], Kokkos::HostSpace> U(Kokkos::ViewAllocateWithoutInitializing("U"));
  Kokkos::View<double[2][2], Kokkos::HostSpace> Vt(Kokkos::ViewAllocateWithoutInitializing("Vt"));
  Kokkos::View<double[2], Kokkos::HostSpace> S(Kokkos::ViewAllocateWithoutInitializing("S"));
  Kokkos::View<double[3], Kokkos::HostSpace> work(Kokkos::ViewAllocateWithoutInitializing("work"));

  A(0, 0) = -1.6175067619642277e-05;
  A(1, 0) = -1.6175067619642270e-05;
  A(2, 0) = 3.0662409276442540e-21;

  A(0, 1) = 1.6175067619642277e-05;
  A(1, 1) = -1.6175067619642277e-05;
  A(2, 1) = 2.3002860307475551e-21;

  KokkosBatched::SerialSVD::invoke(KokkosBatched::SVD_USV_Tag{}, A, U, S, Vt, work);
}

The call appears to never return if a tol is not specified; when a tol is passed, it solves correctly. The stack trace points to void KokkosBatched::SerialSVDInternal::svdStep<double>(...)
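For anyone who wants to check a result independently of Kokkos Kernels, here is a minimal sketch that computes reference singular values for a 3x2 matrix from the eigenvalues of the 2x2 Gram matrix A^T A. The helper name reference_singular_values_3x2 is hypothetical and not part of any library. Forming A^T A squares the condition number, so treat this only as a sanity check for well-conditioned inputs such as the matrix above, whose two columns are nearly orthogonal with nearly equal norms.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical helper (not part of Kokkos Kernels): singular values of a
// 3x2 matrix, largest first, via the closed-form eigenvalues of G = A^T A.
inline std::array<double, 2> reference_singular_values_3x2(const double (&A)[3][2]) {
  // Entries of the symmetric 2x2 Gram matrix G = [[g00, g01], [g01, g11]].
  double g00 = 0.0, g01 = 0.0, g11 = 0.0;
  for (int i = 0; i < 3; ++i) {
    g00 += A[i][0] * A[i][0];
    g01 += A[i][0] * A[i][1];
    g11 += A[i][1] * A[i][1];
  }
  // Eigenvalues of G are mean +/- radius.
  const double mean   = 0.5 * (g00 + g11);
  const double radius = std::hypot(0.5 * (g00 - g11), g01);
  // Clamp at zero so rounding cannot produce the square root of a negative number.
  const double big   = std::max(0.0, mean + radius);
  const double small = std::max(0.0, mean - radius);
  return {std::sqrt(big), std::sqrt(small)};
}
```

For the matrix in this report, both values should come out very close to 1.6175067619642277e-05 * sqrt(2), since the third row is negligible and the two columns are almost exactly orthogonal.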

I reproduced the behavior with clang-14.0.6 and gcc-10.3.0.

Could we also document the expected size for work in the invoke call?

Thanks,

Yaro

vasylivy avatar Mar 21 '25 13:03 vasylivy

I am testing this matrix in PR #2576. My local test worked fine; let's see if any of the CI builds fail on it. We did put a patch in develop to compute the shift in a more stable fashion, which might help with your use case.
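As background on what a stably computed shift can look like, here is a sketch of the textbook Wilkinson shift for a symmetric 2x2 block [[a, b], [b, c]]. This is an assumption for illustration only: it is not taken from the develop patch, and the function name wilkinson_shift is hypothetical. The form below avoids subtracting two nearly equal quantities and avoids squaring b directly.

```cpp
#include <cmath>

// Hypothetical helper (not the Kokkos Kernels implementation): Wilkinson
// shift for the symmetric block [[a, b], [b, c]], i.e. the eigenvalue of
// the block that lies closer to c.
inline double wilkinson_shift(double a, double b, double c) {
  if (b == 0.0) return c;  // block is already diagonal, c is an eigenvalue
  const double d      = 0.5 * (a - c);
  const double sign_d = (d >= 0.0) ? 1.0 : -1.0;  // convention: sign(0) = +1
  // |d| + sqrt(d^2 + b^2) adds two non-negative terms, so no cancellation;
  // std::hypot guards against overflow/underflow when squaring.
  const double denom = std::abs(d) + std::hypot(d, b);
  // Written as b * (b / denom) rather than (b * b) / denom so that very
  // small b does not underflow when squared.
  return c - sign_d * b * (b / denom);
}
```

For example, the block [[1, 2], [2, 4]] has eigenvalues 0 and 5, and the function returns the one nearer to c = 4.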

lucbv avatar Mar 26 '25 17:03 lucbv

It turns out that the fix for the previous issue like this, #2345, was already in 4.5. But yes, @lucbv did some more work on SVD after that, so hopefully this is already fixed in the current develop.

brian-kelley avatar Mar 26 '25 20:03 brian-kelley