TPLs: some code paths do not lead to the expected kernel due to incomplete view [type] inspection (like `cublasDdot`)
The following code seems to account for only a subset of the valid conditions: https://github.com/kokkos/kokkos-kernels/blob/9bca19c85b88aeca97209ec7cde858447e16696c/blas/tpls/KokkosBlas1_dot_tpl_spec_decl.hpp#L132-L140
Indeed, both
`Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Cuda>`
and
`Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Cuda>`
should map to `cublasDdot`, since a rank-1 view is laid out identically in memory in both cases.
In other words, the condition should probably be: does the view have unit stride?
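To make the suggestion concrete, here is a minimal sketch of what such a condition could look like. It is not the Kokkos Kernels code: the layout tags are local stand-ins so the snippet compiles without Kokkos, and `has_static_unit_stride` is a hypothetical name. It only covers what can be decided at compile time, so a strided view is treated conservatively as not unit stride.

```cpp
#include <cassert>
#include <type_traits>

// Local stand-ins for the Kokkos layout tags (assumption: no Kokkos headers here).
struct LayoutLeft {};
struct LayoutRight {};
struct LayoutStride {};

// Hypothetical trait: true when a view of this layout and rank is known at
// compile time to be contiguous with unit stride. For rank 1, LayoutLeft and
// LayoutRight describe the same memory, so both qualify. LayoutStride carries
// its stride at run time, so it is rejected here.
template <class Layout, int Rank>
struct has_static_unit_stride
    : std::integral_constant<bool,
                             Rank == 1 &&
                                 (std::is_same<Layout, LayoutLeft>::value ||
                                  std::is_same<Layout, LayoutRight>::value)> {};

// Compile-time checks of the intended behavior.
static_assert(has_static_unit_stride<LayoutLeft, 1>::value, "rank-1 Left");
static_assert(has_static_unit_stride<LayoutRight, 1>::value, "rank-1 Right");
static_assert(!has_static_unit_stride<LayoutStride, 1>::value, "rank-1 Stride");
```

A TPL eligibility check written against a trait like this would accept both view types above instead of only the `LayoutLeft` one.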
@romintomasetti Does it fail to call cublas if you pass in LayoutRight 1-D vectors? Before we get to this point in the code, we "unify" layouts to the preferred layout (Left) if possible.
From `KokkosBlas1_dot.hpp`:

    using XVector_Internal = Kokkos::View<typename XVector::const_value_type*,
        typename KokkosKernels::Impl::GetUnifiedLayout<XVector>::array_layout,
        typename XVector::device_type, Kokkos::MemoryTraits<Kokkos::Unmanaged>>;
    using YVector_Internal = Kokkos::View<typename YVector::const_value_type*,
        typename KokkosKernels::Impl::GetUnifiedLayout<YVector>::array_layout,
        typename YVector::device_type, Kokkos::MemoryTraits<Kokkos::Unmanaged>>;

For example, if XVector::rank == 1, then whether it's LayoutLeft or LayoutRight, GetUnifiedLayout<XVector>::array_layout will always be LayoutLeft. A few lines down we convert X and Y to be XVector_Internal/YVector_Internal before calling the implementation. This way the TPL specialization should match.
(If XVector is LayoutStride, then GetUnifiedLayout<XVector>::array_layout will also be LayoutStride)
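A rough sketch of the unification described above, in case it helps to see it in isolation. This is not the actual `GetUnifiedLayout` implementation: the layout tags are local stand-ins so it compiles without Kokkos, and `UnifiedLayoutSketch`, `matches_tpl_spec` and `dispatches_to_tpl` are hypothetical names. The behavior for ranks other than 1 (layout left unchanged) is an assumption made only to keep the sketch complete.

```cpp
#include <cassert>
#include <type_traits>

// Local stand-ins for the Kokkos layout tags (assumption: no Kokkos headers here).
struct LayoutLeft {};
struct LayoutRight {};
struct LayoutStride {};

// Hypothetical mimic of the described behavior: keep the layout as is...
template <class Layout, int Rank>
struct UnifiedLayoutSketch {
  using type = Layout;
};

// ...except that a rank-1 LayoutRight view is mapped to the preferred LayoutLeft.
template <>
struct UnifiedLayoutSketch<LayoutRight, 1> {
  using type = LayoutLeft;
};

// Hypothetical stand-in for a TPL specialization that only matches LayoutLeft.
template <class Layout>
struct matches_tpl_spec : std::is_same<Layout, LayoutLeft> {};

// The eligibility test is applied to the unified layout, not the user's layout.
template <class Layout, int Rank>
constexpr bool dispatches_to_tpl() {
  return matches_tpl_spec<typename UnifiedLayoutSketch<Layout, Rank>::type>::value;
}

static_assert(dispatches_to_tpl<LayoutLeft, 1>(), "Left reaches the TPL");
static_assert(dispatches_to_tpl<LayoutRight, 1>(), "Right is unified to Left");
static_assert(!dispatches_to_tpl<LayoutStride, 1>(), "Stride stays Stride");
```

So even if the specialization itself only checks for `LayoutLeft`, a rank-1 `LayoutRight` input should still reach it after unification, while a `LayoutStride` input would not.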