oneDPL
A one group version of merge_sort
I think it's close, and there are some decent opportunities for short-term improvement.
In the long term, a one-group version of this could be a big improvement for small n: it would replace the O(log(n)) kernel launches with a single kernel launch in cases where we would be running in only one or a few work-groups anyway.
Originally posted by @danhoeflinger in https://github.com/oneapi-src/oneDPL/pull/1098#pullrequestreview-1594106537
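
To make the launch-count argument concrete, here is a minimal host-side sketch in plain C++. It is not oneDPL code: `bottom_up_merge_sort` is a hypothetical helper, and the assumption is that each pass of the outer loop corresponds to one kernel launch in a multi-launch device implementation. A one-group version would presumably run the same passes inside a single kernel, separated by work-group barriers instead of separate launches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oneDPL): bottom-up merge sort on the host.
// Each iteration of the outer loop merges runs of length `width` into runs of
// length 2 * width. In a multi-launch device implementation, each such pass is
// assumed to map to one kernel launch. Returns the number of passes performed.
template <typename T>
std::size_t bottom_up_merge_sort(std::vector<T>& data)
{
    const std::size_t n = data.size();
    std::size_t passes = 0;

    for (std::size_t width = 1; width < n; width *= 2)
    {
        // Merge each adjacent pair of sorted runs [lo, mid) and [mid, hi).
        for (std::size_t lo = 0; lo + width < n; lo += 2 * width)
        {
            const std::size_t mid = lo + width;
            const std::size_t hi = std::min(lo + 2 * width, n);
            assert(mid < hi);
            std::inplace_merge(data.begin() + lo, data.begin() + mid, data.begin() + hi);
        }
        ++passes; // one "kernel launch" in the multi-launch model
    }
    return passes;
}
```

With 8 elements the sketch performs 3 passes (run widths 1, 2, 4), i.e. ceil(log2(n)) passes when starting from runs of length 1. A real implementation would likely start from larger pre-sorted leaves, which lowers the count but keeps it logarithmic in n.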