kokkos-kernels icon indicating copy to clipboard operation
kokkos-kernels copied to clipboard

Wrong results from scal and axpy when input type promotion is involved

Open Char-Aznable opened this issue 3 years ago • 6 comments

This code gives the wrong result for doing a KokkosBlas::scal and ::axpy when the input has different type than the output view:

#include <iostream>
#include <Kokkos_Core.hpp>
#include <KokkosBlas1_sum.hpp>
#include <KokkosBlas1_scal.hpp>
#include <KokkosBlas1_axpby.hpp>

using ExecutionSpace = Kokkos::DefaultExecutionSpace;
using MemorySpace = ExecutionSpace::memory_space;
using Device = Kokkos::Device<ExecutionSpace,MemorySpace>;


int main(int argc, char* argv[]) {
  Kokkos::initialize(argc,argv); {

  Kokkos::View<float*, Device> a("a", 3), c("c", 3);
  Kokkos::View<int*, Device> b("b", 3);
  Kokkos::deep_copy(ExecutionSpace{}, b, 4);
  const auto s = KokkosBlas::sum(b);
  const float n = float(1) / s;
  KokkosBlas::scal(a, n, b);
  KokkosBlas::axpy(n, b, c);

  Kokkos::fence();

  auto aH = Kokkos::create_mirror_view_and_copy(Kokkos::DefaultHostExecutionSpace{}, a);
  auto cH = Kokkos::create_mirror_view_and_copy(Kokkos::DefaultHostExecutionSpace{}, c);
  std::cout << "sum = " << s << " norm = " << n << "\n";
  std::cout << " a = \n";
  for(int i = 0; i < aH.extent(0); ++i) {
    std::cout << aH(i) << "\n";
  }
  std::cout << " c = \n";
  for(int i = 0; i < cH.extent(0); ++i) {
    std::cout << cH(i) << "\n";
  }

  } Kokkos::finalize();
}

Here both a and c ends up being zeros unless the type of b is changed from int to float.

Char-Aznable avatar Dec 17 '21 00:12 Char-Aznable

@Char-Aznable thanks for reporting this issue, let me tag a few developers here @vqd8a @e10harvey they might have some ideas. We will have a look but maybe not before January.

lucbv avatar Dec 20 '21 16:12 lucbv

@Char-Aznable It looks to me in the KokkosBlas::scal and KokkosBlas::axpy, the scalar type of a are internally unified with the value_type of the input view X (a is n and X is b, respectively, in your code). When the value_type of b is int, the scalar type of n is internally set to int too. So ,there must be a cast from float to int which makes n be zero.

I think we can fix this issue by unifying the scalar type of a with the value_type of the output view R (for KokkosBlas::scal) or with the value_type of the output view Y (for KokkosBlas::axpy) instead. But I am not sure if this breaks anything else.

@e10harvey, Any thoughts?

vqd8a avatar Dec 28 '21 09:12 vqd8a

Can we do something like std::is_constructible<R::value_type, decltype(std::declval<LHS::value_type>() * std::declval<RHS::value_type>())>::value to assert the input output type compatibility so to avoid having to explicitly type cast?

Char-Aznable avatar Dec 28 '21 22:12 Char-Aznable

Implicit type casting should always upcast to float in this case. Where are we explicitly down casting in scal or axpy?

e10harvey avatar Jan 04 '22 18:01 e10harvey

We should set ourselves a standard that we document on the wiki or somewhere else that spells the expected behavior when we are dealing with mixed precision. On top of adding clarity for users it would provide developers a clear guideline on how to handle this.

lucbv avatar Jan 04 '22 19:01 lucbv

Implicit type casting should always upcast to float in this case. Where are we explicitly down casting in scal or axpy?

@e10harvey , I think they are in here: https://github.com/kokkos/kokkos-kernels/blob/6c786cd6900b977e7b3b19a1f88c0c433a49cbcc/src/blas/KokkosBlas1_scal.hpp#L99-L104 https://github.com/kokkos/kokkos-kernels/blob/6c786cd6900b977e7b3b19a1f88c0c433a49cbcc/src/blas/KokkosBlas1_axpby.hpp#L105-L113

vqd8a avatar Jan 04 '22 19:01 vqd8a