cudaKDTree
cub::DeviceRadixSort or thrust::sort
In my experience, when sort is called many times in quick succession, the latency of the CUB library is much lower than that of Thrust. You could use either cub::DeviceRadixSort or cub::DeviceSegmentedRadixSort. Would you be interested in trying them out? I look forward to your reply.
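To make the suggestion concrete, here is a minimal sketch of the pattern I have in mind. It is not taken from cudaKDTree: `SortScratch`, `sortKeys`, and `runDemo` are hypothetical names used only for illustration, and plain `unsigned` keys are an assumption about the data. The idea is that cub::DeviceRadixSort lets the caller own the temporary buffer, so repeated sorts can reuse one allocation. That is one likely reason for the lower latency, though I have not measured it on this code base.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Hypothetical scratch buffer that is kept alive across sort calls.
struct SortScratch {
  void*  d_temp     = nullptr;
  size_t temp_bytes = 0;
};

// Hypothetical helper: sort n keys from d_in into d_out, growing the
// scratch buffer only when the current one is too small.
cudaError_t sortKeys(SortScratch& s, const unsigned* d_in, unsigned* d_out,
                     int n, cudaStream_t stream = 0)
{
  if (n <= 0) return cudaSuccess;

  // First call with a null buffer only queries the required size.
  size_t needed = 0;
  cudaError_t err = cub::DeviceRadixSort::SortKeys(
      nullptr, needed, d_in, d_out, n, 0, 32, stream);
  if (err != cudaSuccess) return err;

  if (needed > s.temp_bytes) {
    cudaFree(s.d_temp);
    s.d_temp = nullptr;
    s.temp_bytes = 0;
    err = cudaMalloc(&s.d_temp, needed);
    if (err != cudaSuccess) return err;
    s.temp_bytes = needed;
  }

  // Second call performs the sort using the reused buffer.
  return cub::DeviceRadixSort::SortKeys(
      s.d_temp, s.temp_bytes, d_in, d_out, n, 0, 32, stream);
}

// Sorts the same input several times and checks the result on the host.
// Returns 0 on success, 1 on any CUDA error or unsorted output.
int runDemo()
{
  const int n = 1 << 16;
  std::vector<unsigned> h(n);
  for (int i = 0; i < n; ++i) h[i] = (unsigned)(n - i) * 2654435761u;

  unsigned *d_in = nullptr, *d_out = nullptr;
  if (cudaMalloc(&d_in, n * sizeof(unsigned)) != cudaSuccess) return 1;
  if (cudaMalloc(&d_out, n * sizeof(unsigned)) != cudaSuccess) return 1;
  cudaMemcpy(d_in, h.data(), n * sizeof(unsigned), cudaMemcpyHostToDevice);

  SortScratch scratch;
  int status = 0;
  for (int iter = 0; iter < 10 && status == 0; ++iter)  // repeated sorts, one allocation
    if (sortKeys(scratch, d_in, d_out, n) != cudaSuccess) status = 1;

  cudaMemcpy(h.data(), d_out, n * sizeof(unsigned), cudaMemcpyDeviceToHost);
  for (int i = 1; i < n && status == 0; ++i)
    if (h[i - 1] > h[i]) status = 1;

  cudaFree(scratch.d_temp);
  cudaFree(d_in);
  cudaFree(d_out);
  return status;
}

int main()
{
  int status = runDemo();
  std::printf("%s\n", status == 0 ? "sorted" : "failed");
  return status;
}
```

cub::DeviceSegmentedRadixSort follows the same two-call pattern (size query, then sort) and additionally takes segment offsets, which could help if many small ranges need sorting in a single launch.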