cuda-kat
Add <algorithm> and <numeric> functions as thread-level primitives?
While it's rarely a great idea, for the sake of completeness we may want to have implementations of the abstract <algorithm> and <numeric> algorithms which each thread could run on its own, without collaborating with other threads, each on its own data.
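To make the question concrete, here is a rough sketch of what two such thread-level primitives might look like. Everything in it is an assumption made for illustration: the `sketch` namespace, the `THREAD_FN` macro, and the `row_sums` kernel are hypothetical names, not part of cuda-kat's actual API. The functions are plain sequential loops, so each calling thread does all of the work over its own range by itself.

```cpp
// Hypothetical macro: marks functions as usable in device code under nvcc,
// and falls back to plain inline functions under a host-only C++ compiler.
#ifdef __CUDACC__
#define THREAD_FN __host__ __device__ inline
#else
#define THREAD_FN inline
#endif

namespace sketch { // hypothetical namespace, not cuda-kat's

// Sequential analogue of std::accumulate: the calling thread alone
// folds the whole range [first, last) into init.
template <typename InputIt, typename T>
THREAD_FN T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = init + *first;
    }
    return init;
}

// Sequential analogue of std::find: returns an iterator to the first
// element equal to value, or last if there is none.
template <typename InputIt, typename T>
THREAD_FN InputIt find(InputIt first, InputIt last, const T& value)
{
    for (; first != last; ++first) {
        if (*first == value) { return first; }
    }
    return last;
}

} // namespace sketch

#ifdef __CUDACC__
// Hypothetical usage: each thread sums one row of a row-major matrix,
// without any collaboration with other threads.
__global__ void row_sums(const int* data, int row_length, int num_rows, int* sums)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        const int* begin = data + row * row_length;
        sums[row] = sketch::accumulate(begin, begin + row_length, 0);
    }
}
#endif
```

In the `row_sums` usage, neighbouring threads read addresses that are `row_length` elements apart, so the accesses are not coalesced. That is one reason this pattern is rarely a great idea, even if it is sometimes convenient.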
What do you think? Good idea? Bad idea?
See also issue #18.