Daniel Arndt
I pushed a commit disabling the relevant tests for SYCL+Cuda.
Let's see if we can get #5332 in; then we don't need this one.
What I wanted to achieve with this pull request (mostly) was that the way we promised `Legion` to initialize `Kokkos` works, i.e., the sequence of ``` Kokkos::Impl::pre_initialize() EXECUTION_SPACE::impl_initialize(); Kokkos::Impl::post_initialize(); EXECUTION_SPACE::impl_finalize();...
I haven't tried building APEX and running with it.
I can reproduce it in `APEX` and reverting https://github.com/kokkos/kokkos/pull/4682 doesn't seem to change anything. I wouldn't be surprised if all the memory leaks reported come from the singleton that is...
For me, all reported memory leaks vanish with the following patch. ```diff diff --git a/core/src/Cuda/Kokkos_Cuda_Instance.cpp b/core/src/Cuda/Kokkos_Cuda_Instance.cpp index efad2ff1c..816eb07a4 100644 --- a/core/src/Cuda/Kokkos_Cuda_Instance.cpp +++ b/core/src/Cuda/Kokkos_Cuda_Instance.cpp @@ -386,9 +386,9 @@ void CudaInternal::initialize(cudaStream_t stream,...
I don't see any memory leaks with ``` #include #include int main(int argc, char* argv[]) { Kokkos::initialize(argc, argv); { void * ptr; // cudaMalloc(&ptr, 1024); int N = argc >...
@rgayatri23 Please rebase so that this doesn't include 611 commits.
> Please comment whether this can/should be implemented for other parallel constructs with a RangePolicy. `parallel_for` was just the most relevant case, but I intended to add versions for other...
@crtrott For the benchmark in https://github.com/kokkos/kokkos/issues/5581#issue-1415470255, I see ``` Normal: 1.68759 s. Unmanaged: 0.622187 s. ``` before this pull request and ``` Normal: 1.70023 s. Unmanaged: 0.622541 s. ``` after.