View Refactor: Introduce new View implementation

Open crtrott opened this issue 1 year ago • 2 comments

This introduces the new View implementation derived from BasicView.

The new View implementation can be disabled via -DKokkos_ENABLE_IMPL_VIEW_LEGACY=ON. This option is also set to on if -DKokkos_ENABLE_IMLP_MDSPAN=OFF is set.
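As a rough sketch of how this option would be passed at configure time (the source and build directory names are placeholders, not taken from this PR, and the single Serial backend is only an assumption to keep the example short):

```shell
# Hypothetical layout: Kokkos sources in ./kokkos, out-of-tree build directories.

# Build with the new BasicView-based View implementation
# (legacy explicitly switched off).
cmake -S kokkos -B build-new-view \
  -DKokkos_ENABLE_SERIAL=ON \
  -DKokkos_ENABLE_IMPL_VIEW_LEGACY=OFF

# Opt back into the legacy View implementation.
cmake -S kokkos -B build-legacy-view \
  -DKokkos_ENABLE_SERIAL=ON \
  -DKokkos_ENABLE_IMPL_VIEW_LEGACY=ON
```

Keeping the two configurations in separate build directories makes it easier to compare test results between the legacy and new implementations.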

Restrictions:

  • New View implementation does not support the old specialization mechanism used, for example, by Sacado
  • ViewHooks are not supported
  • CUDA ldg loads are not yet supported (needs a special accessor); this may impact performance but is not a breaking change
  • On Windows, the legacy View is enabled by default

Known changed behavior:

  • use_count() is always 0 for unmanaged (runtime or compile-time) Views.

Open questions:

  • Should to_mdspan() return an unmanaged mdspan or pass on the reference-counted accessor?
  • Convertibility of managed to unmanaged View/mdspan

crtrott avatar Oct 10 '24 18:10 crtrott

There's a typo in the description, IMLP rather than IMPL. Hopefully that's not repeated in the code anywhere

PhilMiller avatar Oct 11 '24 14:10 PhilMiller

Some builds are failing because we need to update the hash of desul; we require the changes that were merged into desul in https://github.com/desul/desul/pull/129

nmm0 avatar Oct 17 '24 21:10 nmm0

Retest this please

crtrott avatar Dec 03 '24 20:12 crtrott

Retest this please!

crtrott avatar Dec 18 '24 00:12 crtrott

Retest this please

nmm0 avatar Jan 15 '25 16:01 nmm0

@crtrott I'm starting to explore some compiler/legacy-off combos; the following fails:

LEGACY=OFF

Setup 1: gcc/12.2.0  cuda/12.0.0
Setup 2: gcc/11.4    cuda/12.4.0   
Setup 3: gcc/9.3.0   cuda/11.8.0 (w/ C++17)

CMake args:
  -DCMAKE_CXX_COMPILER=$KOKKOS_DIR/bin/nvcc_wrapper 
  -DCMAKE_CXX_FLAGS=-Werror 
  -DCMAKE_CXX_STANDARD=20 
  -DKokkos_ENABLE_OPENMP=OFF 
  -DKokkos_ENABLE_SERIAL=ON 
  -DKokkos_ENABLE_CUDA=ON 
  -DKokkos_ENABLE_TESTS=ON 
  -DCMAKE_BUILD_TYPE:STRING=Debug
  -DKokkos_ENABLE_DEPRECATED_CODE_4=OFF 
  -DKokkos_ENABLE_DEPRECATION_WARNINGS=ON 
  -DKokkos_ENABLE_IMPL_VIEW_LEGACY=OFF 

Build failure:

/home/tccleve/Kokkos/kokkos/core/unit_test/TestViewTypedefs.cpp(164): error: static assertion failed
          detected during:
            instantiation of "bool <unnamed>::test_view_typedefs_impl<ViewType,ViewTraitsType,DataType,Layout,Space,MemoryTraitsType,HostMirrorSpace,ValueType,ReferenceType>() [with ViewType=Kokkos::View<int>, ViewTraitsType=Kokkos::ViewTraits<int>, DataType=int, Layout=<unnamed>::TestInt::layout_type, Space=<unnamed>::TestInt::space, MemoryTraitsType=<unnamed>::TestInt::memory_traits, HostMirrorSpace=<unnamed>::TestInt::host_mirror_space, ValueType=int, ReferenceType=int &]" 
(184): here
            instantiation of "bool <unnamed>::test_view_typedefs<L,S,M,HostMirrorSpace,ValueType,ReferenceType,T,ViewArgs...>(<unnamed>::ViewParams<T, ViewArgs...>) [with L=<unnamed>::TestInt::layout_type, S=<unnamed>::TestInt::space, M=<unnamed>::TestInt::memory_traits, HostMirrorSpace=<unnamed>::TestInt::host_mirror_space, ValueType=int, ReferenceType=int &, T=int, ViewArgs=<>]" 

I'm guessing this issue is independent of the compiler. Legacy ON worked in all cases.

tcclevenger avatar Feb 07 '25 03:02 tcclevenger

In the GH200 HPSF build with CUDA 12.6.1:

In member function 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_common(const FunctorType&, BlockSizeCallable&&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; BlockSizeCallable = int (&)(const Kokkos::Impl::CudaInternal*, const cudaFuncAttributes&, const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >&, long unsigned int, long unsigned int, long unsigned int); Properties = {Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda}]',
    inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_max(const FunctorType&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; Properties = {Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:377:48,
    inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::team_size_max(const FunctorType&, const Kokkos::ParallelReduceTag&) const [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; Properties = {Kokkos::Schedule<Kokkos::Static>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:123:247,
    inlined from 'Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::TestTeamPolicy(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Static>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:35:175,
    inlined from 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Static>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16:
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:360:78: error: 'functor' may be used uninitialized [-Werror=maybe-uninitialized]
  360 |     const int block_size = std::forward<BlockSizeCallable>(block_size_callable)(
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~                                                                                                                                                                                                                                                                   
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp: In static member function 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Static>]':
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp:183:1: note: by argument 3 of type 'const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >&' to 'int Kokkos::Impl::cuda_get_max_block_size(const CudaInternal*, const cudaFuncAttributes&, const FunctorType&, size_t, size_t, size_t) [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Static> >; LaunchBounds = Kokkos::LaunchBounds<>]' declared here
  183 | int cuda_get_max_block_size(const CudaInternal* cuda_instance,
      | ^~~~~~~~~~~~~~~~~~~~~~~
/builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16: note: 'functor' declared here
  141 |       TestTeamPolicy functor(league_size);
      |                ^~~~~~~
In member function 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_common(const FunctorType&, BlockSizeCallable&&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; BlockSizeCallable = int (&)(const Kokkos::Impl::CudaInternal*, const cudaFuncAttributes&, const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >&, long unsigned int, long unsigned int, long unsigned int); Properties = {Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda}]',
    inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::internal_team_size_max(const FunctorType&) const [with ClosureType = Kokkos::Impl::ParallelReduce<Kokkos::Impl::CombinedFunctorReducer<Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, Kokkos::Impl::FunctorAnalysis<Kokkos::Impl::FunctorPatternInterface::REDUCE, Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >, void>::Reducer, void>, Kokkos::TeamPolicy<Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda>, Kokkos::Cuda>; FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; Properties = {Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:377:48,
    inlined from 'int Kokkos::Impl::TeamPolicyInternal<Kokkos::Cuda, Properties ...>::team_size_max(const FunctorType&, const Kokkos::ParallelReduceTag&) const [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; Properties = {Kokkos::Schedule<Kokkos::Dynamic>, Kokkos::Cuda}]' at /builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:123:247,
    inlined from 'Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::TestTeamPolicy(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Dynamic>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:35:175,
    inlined from 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Dynamic>]' at /builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16:
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp:360:78: error: 'functor' may be used uninitialized [-Werror=maybe-uninitialized]
  360 |     const int block_size = std::forward<BlockSizeCallable>(block_size_callable)(
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~                                                                                                                                                                                                                                                                   
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp: In static member function 'static void Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<ExecSpace, ScheduleType>::test_for(size_t) [with ExecSpace = Kokkos::Cuda; ScheduleType = Kokkos::Schedule<Kokkos::Dynamic>]':
/builds/kokkos/kokkos/core/src/Cuda/Kokkos_Cuda_BlockSize_Deduction.hpp:183:1: note: by argument 3 of type 'const Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >&' to 'int Kokkos::Impl::cuda_get_max_block_size(const CudaInternal*, const cudaFuncAttributes&, const FunctorType&, size_t, size_t, size_t) [with FunctorType = Test::_GLOBAL__N__033b18a4_22_TestCuda_TeamBasic_cpp_6cbc7a56_4764::TestTeamPolicy<Kokkos::Cuda, Kokkos::Schedule<Kokkos::Dynamic> >; LaunchBounds = Kokkos::LaunchBounds<>]' declared here
  183 | int cuda_get_max_block_size(const CudaInternal* cuda_instance,
      | ^~~~~~~~~~~~~~~~~~~~~~~
/builds/kokkos/kokkos/core/unit_test/TestTeam.hpp:141:16: note: 'functor' declared here
  141 |       TestTeamPolicy functor(league_size);
      |                ^~~~~~~

dalg24 avatar Apr 04 '25 11:04 dalg24

Retest this please.

crtrott avatar Apr 07 '25 16:04 crtrott

This definitely needs a 4.7 changelog entry.

ndellingwood avatar Apr 16 '25 02:04 ndellingwood