Luc Berger
Also adding @ndellingwood to avoid filing this twice :)
Thanks for letting me know, I will look at that after the build issues are resolved.
Hum, someone forgot to add a `KOKKOS_FUNCTION` or `KOKKOS_INLINE_FUNCTION` somewhere...
@ndellingwood yeah, I was trying to get us through the build issues. I am not 100% sure why rdc+uvm creates runtime failures and will need a bit more time...
This should now be fixed with PR #1470 being merged. This issue also seems to be a duplicate of #1413.
Support for partition_spaces and separate execution space instances (GPU streams) in Kokkos Kernels.
@dialecticDolt I merged the work on this feature in PR #1131. Let me know if that meets your requirements. If so, we can probably close this issue; otherwise let's discuss...
@cwpearson not all of the CI has been switched to C++17 (see the Weaver build on GPU with CUDA 10), so this will need to wait for a bit before it can...
Actually a good start would be to re-run this with finer timers, for instance using [Kokkos-tools](https://github.com/kokkos/kokkos-tools) simple-kernel-timer. You just need to build the tool and then set: `export KOKKOS_PROFILE_LIBRARY=${HOME}/kokkos-tools/profiling/simple-kernel-timer/kp_kernel_timer.so`
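For reference, here is a minimal sketch of that setup as a shell snippet. It assumes Kokkos-tools was cloned and built under `${HOME}/kokkos-tools` (the path from the comment above), and the application name in the last comment is purely hypothetical. The file check is only there so that a missing library is reported instead of the variable pointing at nothing.

```shell
# Assumed location of the built simple-kernel-timer library; adjust to your checkout.
TIMER_LIB="${HOME}/kokkos-tools/profiling/simple-kernel-timer/kp_kernel_timer.so"

if [ -f "${TIMER_LIB}" ]; then
  # Kokkos loads the profiling library named by this variable at initialization.
  export KOKKOS_PROFILE_LIBRARY="${TIMER_LIB}"
  echo "Profiling enabled with ${KOKKOS_PROFILE_LIBRARY}"
else
  echo "kp_kernel_timer.so not found at ${TIMER_LIB}; build the tool first" >&2
fi

# Then re-run the application as usual, for example (hypothetical binary name):
# ./my_kokkos_app
```

Unsetting the variable again (`unset KOKKOS_PROFILE_LIBRARY`) should bring you back to an unprofiled run.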
I can assign you, but will also look at adding you to Kokkos org and Kokkos Kernels team.
Okay, hopefully PR #1412 fixed it, let me know if you see it again, otherwise we might be able to close this.