kokkos-tools
Kokkos C++ Performance Portability Programming Ecosystem: Profiling and Debugging Tools
It would be nice if there were a way to see the total memory transferred between different memory spaces, specifically between GPU and CPU.
This might be a corner case. I have some tests in Trilinos that are single node only. They do not need MPI and, therefore, don't call MPI_Init in main (in...
I need to run the Kokkos kernel logger on hundreds of thousands of MPI ranks for many hours to debug an issue. I only care about the last few kernels,...
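The request above is cut off, so the following is only a sketch of one possible approach, not what the issue proposes or what kokkos-tools implements: a fixed-size ring buffer that retains only the most recent kernel names, so memory and output stay bounded however long the run lasts. The class `RecentKernelLog` and its methods are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of kokkos-tools): keeps only the last
// `capacity` kernel names, overwriting the oldest entry when full.
class RecentKernelLog {
 public:
  explicit RecentKernelLog(std::size_t capacity)
      : slots_(capacity), next_(0), count_(0) {
    assert(capacity > 0 && "capacity must be positive");
  }

  // Store a kernel name, evicting the oldest one if the buffer is full.
  void record(const std::string& kernel_name) {
    slots_[next_] = kernel_name;
    next_ = (next_ + 1) % slots_.size();
    if (count_ < slots_.size()) ++count_;
  }

  // Return the retained names ordered from oldest to newest.
  std::vector<std::string> snapshot() const {
    std::vector<std::string> out;
    if (count_ == 0) return out;
    out.reserve(count_);
    const std::size_t start = (next_ + slots_.size() - count_) % slots_.size();
    for (std::size_t i = 0; i < count_; ++i) {
      out.push_back(slots_[(start + i) % slots_.size()]);
    }
    return out;
  }

 private:
  std::vector<std::string> slots_;  // fixed-size storage
  std::size_t next_;                // index of the next slot to write
  std::size_t count_;               // number of valid entries
};
```

A tool could call `record` from its kernel-begin callback and write out `snapshot()` only when the run ends or fails. Whether that fits the logger's actual design is an open question.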
On my Mac laptop, the Kokkos memory profiling values from `getrusage` are off by a factor of 1024. This is because `getrusage` uses units of *bytes* on Mac but *kilobytes* on...
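A minimal sketch of how the unit difference could be normalized, assuming the value in question is `ru_maxrss` and that it is reported in bytes on macOS and in kilobytes on Linux. The function names `maxrss_to_bytes` and `peak_rss_bytes` are hypothetical and are not taken from kokkos-tools.

```cpp
#include <sys/resource.h>

#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a raw ru_maxrss value to bytes.
// Assumption: the caller knows whether the platform reports bytes (macOS)
// or kilobytes (Linux); 1 kilobyte is treated as 1024 bytes.
inline std::uint64_t maxrss_to_bytes(long raw_maxrss, bool reported_in_bytes) {
  assert(raw_maxrss >= 0);
  const std::uint64_t value = static_cast<std::uint64_t>(raw_maxrss);
  return reported_in_bytes ? value : value * 1024ULL;
}

// Hypothetical helper: peak resident set size of this process in bytes,
// or 0 if getrusage fails.
inline std::uint64_t peak_rss_bytes() {
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) return 0;
#if defined(__APPLE__)
  const bool in_bytes = true;   // macOS: ru_maxrss is in bytes
#else
  const bool in_bytes = false;  // Linux: ru_maxrss is in kilobytes
#endif
  return maxrss_to_bytes(usage.ru_maxrss, in_bytes);
}
```

Keeping the conversion in one place means any later formatting (for example into MB) works from a single unit on both platforms.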
Should gracefully error out instead of segfaulting, and alert the user to use dynamic linking.
The HighWater-Process(MB) metric for the CUDA memory space (file *.Cuda.memspace_usage) is bogus: it is the host value and doesn't accurately reflect the CUDA value.
When there are a lot of memory events, the Kokkos memory events tool itself can have significant memory overhead because it stores all of the events. This should be subtracted off to...
If the code crashes while profiling, for example because it runs out of memory, there is zero output from the memory tools, which doesn't help debugging. It would be nice if the...
Fixes https://github.com/kokkos/kokkos-tools/issues/40. By default, `SpaceTimeStack` sets `USE_MPI=1` and assumes that the application uses MPI and initializes it. As also described in the linked issue, this assumption is often not true....
https://github.com/kokkos/kokkos-tools/blob/c901382c4c76b108e4e6d190e9236848dc764526/profiling/simple-kernel-timer/kp_kernel_timer.cpp#L120 It seems this name is not correct. It should say `SimpleKernelTimer loaded` or something like that. I suppose something along the lines of the following: https://github.com/kokkos/kokkos-tools/blob/c901382c4c76b108e4e6d190e9236848dc764526/profiling/memory-events/kp_memory_events.cpp#L85