Greg Mackey

9 comments by Greg Mackey

What I've done before is written a class that is templated on the ExecutionSpace. The default version's constructor calls std::sort(). The specialization for Cuda calls thrust::sort(). You can use the...
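A minimal sketch of the dispatch pattern described in this comment, written in plain C++ so it compiles without Kokkos or Thrust. The names `HostSpaceTag`, `DeviceSpaceTag`, `Sorter`, and `sort_on_host` are hypothetical stand-ins, not taken from the original post. In a real build the template parameter would be a Kokkos execution space, and the specialization body would call thrust::sort() instead of the placeholder used here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical tags standing in for execution spaces
// (for example, a host space and Kokkos::Cuda).
struct HostSpaceTag {};
struct DeviceSpaceTag {};

// Default version: the constructor sorts the range with std::sort().
template <class ExecutionSpace>
struct Sorter {
  template <class Iterator>
  Sorter(Iterator first, Iterator last) {
    std::sort(first, last);
  }
};

// Specialization for the device tag. Assumption: in a CUDA build this
// body would be thrust::sort(first, last) on device-accessible iterators.
// std::sort() is only a placeholder so the sketch runs on the host.
template <>
struct Sorter<DeviceSpaceTag> {
  template <class Iterator>
  Sorter(Iterator first, Iterator last) {
    std::sort(first, last);  // placeholder for thrust::sort(first, last)
  }
};

// Usage: constructing the object performs the sort.
inline std::vector<int> sort_on_host(std::vector<int> values) {
  Sorter<HostSpaceTag> sorter(values.begin(), values.end());
  (void)sorter;  // the object exists only for its constructor's side effect
  assert(std::is_sorted(values.begin(), values.end()));
  return values;
}
```

Selecting the implementation through the template argument keeps the call site identical on every backend. As the later comments in this thread point out, a parallel sort such as thrust::sort() is meant to be launched from host code, not from inside a parallel loop.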

Oh, I forgot to mention the obvious. You can now call this in a parallel loop like this. Umm, I guess this would really be sorting a view of views instead...

@mhoemmen Oh, you're right. I combined code from two places for my example. thrust::sort is a parallel sort. It shouldn't be used inside a parallel loop. One of the pieces...

@tawiesn I'm not going to answer your question, since if you look at my above post, thrust::sort is not what you want. What I will do is a better temporary...

@tawiesn Forgot to mention that you can use this inside a parallel loop. It will work on CPU and GPU.
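For illustration only, here is a sketch of what a sort that is safe to call from inside a parallel loop can look like: a serial routine that each iteration runs on its own slice of data. The code from the original post is not shown here, so the insertion sort below, and the names `insertion_sort` and `sort_rows`, are assumptions. `KOKKOS_FUNCTION` is the Kokkos macro that marks a function as callable on both host and device; the fallback definition exists only so the sketch compiles without Kokkos.

```cpp
#include <cassert>
#include <cstddef>

// Fallback so the sketch builds without Kokkos. With Kokkos headers
// included, the real macro adds the host/device annotations.
#ifndef KOKKOS_FUNCTION
#define KOKKOS_FUNCTION
#endif

// Serial insertion sort over a contiguous range (assumed algorithm).
// It launches no parallel work, so one thread can call it safely.
template <class T>
KOKKOS_FUNCTION void insertion_sort(T* data, std::size_t n) {
  for (std::size_t i = 1; i < n; ++i) {
    T key = data[i];
    std::size_t j = i;
    // Shift larger elements one slot right to make room for key.
    while (j > 0 && key < data[j - 1]) {
      data[j] = data[j - 1];
      --j;
    }
    data[j] = key;
  }
}

// Sorts each row of a row-major matrix independently. The plain loop
// stands in for a parallel loop over rows; each iteration touches only
// its own row, so iterations do not interfere with each other.
inline void sort_rows(int* matrix, std::size_t rows, std::size_t cols) {
  assert(matrix != nullptr || rows * cols == 0);
  for (std::size_t r = 0; r < rows; ++r) {
    insertion_sort(matrix + r * cols, cols);
  }
}
```

Insertion sort is quadratic in the worst case, so a sketch like this only suits short per-iteration ranges.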

@mhoemmen Yeah, the only issue is adding KOKKOS_FUNCTION in front of the algorithm. I agree with your statement about KOKKOS_INLINE_FUNCTION. I only used it for the wrapper function in the...

In general, would it be useful to have sorting algorithms in Kokkos? How about other algorithms from the std algorithm header? I've got a bunch of these implemented in MTGL...

@mhoemmen What we really need is for NVIDIA to implement a CUDA-aware version of the std algorithm header! :)

@crtrott Sounds good. As I get time, I will add things to Kokkos algorithms.