View Refactor: Introduce new View implementation
This introduces the new View implementation derived from BasicView.
The new View implementation can be disabled via -DKokkos_ENABLE_IMPL_VIEW_LEGACY=ON. This option is also set to on if -DKokkos_ENABLE_IMLP_MDSPAN=OFF is set.
Restrictions:
- The new View implementation does not support the old specialization mechanism used by, for example, Sacado
- ViewHooks are not supported
- CUDA ldg loads are not yet supported (needs a special accessor); this may impact performance but is not a breaking change
- On Windows, the legacy View is enabled by default
Known Changed behavior:
- use_count() is always 0 for unmanaged (runtime or compile-time) Views.
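To make the use_count() item concrete, here is a minimal stand-alone sketch of the semantics described above. It is a toy model built on std::shared_ptr, not Kokkos code: `ToyHandle`, `managed`, `unmanaged`, and `as_unmanaged` are hypothetical names invented for this illustration. "Runtime unmanaged" is modeled as wrapping a raw pointer, and "compile-time unmanaged" as a non-owning alias of managed memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a View-like handle. NOT the Kokkos implementation;
// every name here is hypothetical and exists only for illustration.
class ToyHandle {
 public:
  // Managed: allocates reference-counted storage shared by all copies.
  static ToyHandle managed(std::size_t n) {
    ToyHandle h;
    h.owner_ = std::make_shared<std::vector<double>>(n);
    h.data_  = h.owner_->data();
    h.size_  = n;
    return h;
  }

  // Unmanaged: wraps memory owned by someone else, no reference counting.
  static ToyHandle unmanaged(double* ptr, std::size_t n) {
    ToyHandle h;
    h.data_ = ptr;
    h.size_ = n;
    return h;
  }

  // Non-owning alias of this handle's memory; the reference count is dropped.
  ToyHandle as_unmanaged() const { return unmanaged(data_, size_); }

  // Number of managed handles sharing the storage; always 0 when unmanaged.
  long use_count() const { return owner_ ? owner_.use_count() : 0; }

  double* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  ToyHandle() = default;
  std::shared_ptr<std::vector<double>> owner_;  // empty for unmanaged handles
  double* data_     = nullptr;
  std::size_t size_ = 0;
};
```

In this model, copying a managed handle raises use_count() for every copy, while a handle made with `unmanaged(...)` or `as_unmanaged()` reports 0 even when it points at memory that managed handles still share.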
Open questions:
- should to_mdspan() return an unmanaged mdspan or pass on the reference-counted accessor?
- convertibility of managed to unmanaged View/mdspan
There's a typo in the description: IMLP rather than IMPL. Hopefully that's not repeated anywhere in the code.
Some builds are failing because we need to update the hash of desul; we require the changes that were merged into desul in https://github.com/desul/desul/pull/129
Retest this please
@crtrott Starting to explore some compiler/legacy-off combos, the following fails
LEGACY=OFF
Setup 1: gcc/12.2.0 cuda/12.0.0
Setup 2: gcc/11.4 cuda/12.4.0
Setup 3: gcc/9.3.0 cuda/11.8.0 (w/ C++17)
CMake args:
-DCMAKE_CXX_COMPILER=$KOKKOS_DIR/bin/nvcc_wrapper
-DCMAKE_CXX_FLAGS=-Werror
-DCMAKE_CXX_STANDARD=20
-DKokkos_ENABLE_OPENMP=OFF
-DKokkos_ENABLE_SERIAL=ON
-DKokkos_ENABLE_CUDA=ON
-DKokkos_ENABLE_TESTS=ON
-DCMAKE_BUILD_TYPE:STRING=Debug
-DKokkos_ENABLE_DEPRECATED_CODE_4=OFF
-DKokkos_ENABLE_DEPRECATION_WARNINGS=ON
-DKokkos_ENABLE_IMPL_VIEW_LEGACY=OFF
The build fails with:
/home/tccleve/Kokkos/kokkos/core/unit_test/TestViewTypedefs.cpp(164): error: static assertion failed
detected during:
instantiation of "bool <unnamed>::test_view_typedefs_impl<ViewType,ViewTraitsType,DataType,Layout,Space,MemoryTraitsType,HostMirrorSpace,ValueType,ReferenceType>() [with ViewType=Kokkos::View<int>, ViewTraitsType=Kokkos::ViewTraits<int>, DataType=int, Layout=<unnamed>::TestInt::layout_type, Space=<unnamed>::TestInt::space, MemoryTraitsType=<unnamed>::TestInt::memory_traits, HostMirrorSpace=<unnamed>::TestInt::host_mirror_space, ValueType=int, ReferenceType=int &]"
(184): here
instantiation of "bool <unnamed>::test_view_typedefs<L,S,M,HostMirrorSpace,ValueType,ReferenceType,T,ViewArgs...>(<unnamed>::ViewParams<T, ViewArgs...>) [with L=<unnamed>::TestInt::layout_type, S=<unnamed>::TestInt::space, M=<unnamed>::TestInt::memory_traits, HostMirrorSpace=<unnamed>::TestInt::host_mirror_space, ValueType=int, ReferenceType=int &, T=int, ViewArgs=<>]"
I'm guessing this issue is independent of the compiler. Legacy ON worked in all cases.
The GH200 HPSF build with CUDA 12.6.1 fails with:
In member function 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_common(const FunctorType&, BlockSizeCallable&&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; BlockSizeCallable = int (&)(const Kokkos::Impl::CudaInternal*, const cudaFuncAttributes&, const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >&, long unsigned int, long unsigned int, long unsigned int); Properties = {Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda}]',
inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_max(const FunctorType&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; Properties = {Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:377:48,
inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::team_size_max(const FunctorType&, const Kokkos::ParallelReduceTag&) const [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; Properties = {Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:123:247,
inlined from 'Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::TestTeamPolicy(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Static>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:35:175,
inlined from 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Static>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16:
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:360:78: error: 'functor' may be used uninitialized [-Werror=maybe-uninitialized]
360 | const int block_size = std::forward<BlockSizeCallable>(block_size_callable)(
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp: In static member function 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Static>]':
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp:183:1: note: by argument 3 of type 'const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >&' to 'int Kokkos::Impl::cuda_get_max_block_size(const CudaInternal*, const cudaFuncAttributes&, const FunctorType&, size_t, size_t, size_t) [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; LaunchBounds = Kokkos::LaunchBounds<>]' declared here
183 | int cuda_get_max_block_size(const CudaInternal* cuda_instance,
| ^~~~~~~~~~~~~~~~~~~~~~~
/builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16: note: 'functor' declared here
141 | TestTeamPolicy functor(league_size);
| ^~~~~~~
In member function 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_common(const FunctorType&, BlockSizeCallable&&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; BlockSizeCallable = int (&)(const Kokkos::Impl::CudaInternal*, const cudaFuncAttributes&, const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >&, long unsigned int, long unsigned int, long unsigned int); Properties = {Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda}]',
inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_max(const FunctorType&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; Properties = {Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:377:48,
inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::team_size_max(const FunctorType&, const Kokkos::ParallelReduceTag&) const [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; Properties = {Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:123:247,
inlined from 'Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::TestTeamPolicy(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Dynamic>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:35:175,
inlined from 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Dynamic>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16:
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:360:78: error: 'functor' may be used uninitialized [-Werror=maybe-uninitialized]
360 | const int block_size = std::forward<BlockSizeCallable>(block_size_callable)(
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp: In static member function 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Dynamic>]':
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp:183:1: note: by argument 3 of type 'const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >&' to 'int Kokkos::Impl::cuda_get_max_block_size(const CudaInternal*, const cudaFuncAttributes&, const FunctorType&, size_t, size_t, size_t) [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; LaunchBounds = Kokkos::LaunchBounds<>]' declared here
183 | int cuda_get_max_block_size(const CudaInternal* cuda_instance,
| ^~~~~~~~~~~~~~~~~~~~~~~
/builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16: note: 'functor' declared here
141 | TestTeamPolicy functor(league_size);
| ^~~~~~~
Retest this please.
This definitely needs a 4.7 changelog entry.