
Add Cooperative Group for GPU-STUMP

Open seanlaw opened this issue 1 year ago • 3 comments

Several years ago, we considered (see #266) adding a variant of GPU-STUMP that utilized cooperative groups and that would allow us to push the multiple kernel launches onto the device. Earlier work was concerned about:

  1. Breaking backwards compatibility
  2. Adding unnecessary complexity to the code

However, cudatoolkit support is much better now, and older GPUs that lack cooperative group support are likely end-of-life, so the above concerns are probably a thing of the past. Additionally, numba has advanced many versions since our last attempt. Thus, we should reconsider adding this to STUMPY. PR #266 provides some clear code for how to proceed and demonstrated a 12% speedup, which is great!

See also the numba docs on cooperative groups
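
For anyone picking this up, here is a minimal sketch of the pattern, closely following the example in the numba docs. It is a toy kernel, not STUMPY code: `reference`, `run_cooperative`, and `sequential_rows` are illustrative names, and the sketch assumes the number of columns is a multiple of the block size so that every thread reaches the grid sync. The point is only to show how a loop of dependent steps can live inside one kernel launch, with `g.sync()` replacing the host-side loop of launches.

```python
import numpy as np

try:
    from numba import cuda, int32
except ImportError:  # lets the CPU reference below run without numba
    cuda = None


def reference(n_rows, n_cols):
    """CPU reference: row i is the reversed row i-1 plus one."""
    M = np.zeros((n_rows, n_cols), dtype=np.int32)
    for row in range(1, n_rows):
        M[row] = M[row - 1, ::-1] + 1
    return M


def run_cooperative(n_rows, n_cols, blockdim=32):
    """Toy example: perform all row updates in ONE kernel launch."""
    if cuda is None or not cuda.is_available():
        raise RuntimeError("A CUDA-capable GPU and numba are required")
    if n_cols % blockdim != 0:
        # Assumption of this sketch: no partially filled blocks
        raise ValueError("n_cols must be a multiple of blockdim")

    sig = (int32[:, ::1],)

    @cuda.jit(sig)
    def sequential_rows(M):
        col = cuda.grid(1)
        g = cuda.cg.this_grid()  # handle to the entire grid
        for row in range(1, M.shape[0]):
            opposite = M.shape[1] - col - 1
            # Reads a value that may have been written by another block
            M[row, col] = M[row - 1, opposite] + 1
            g.sync()  # wait for every block before starting the next row

    griddim = n_cols // blockdim

    # A cooperative launch needs all blocks resident on the device at once
    overload = sequential_rows.overloads[sig]
    max_blocks = overload.max_cooperative_grid_blocks(blockdim)
    if griddim > max_blocks:
        raise ValueError(f"Grid of {griddim} blocks exceeds limit {max_blocks}")

    d_M = cuda.to_device(np.zeros((n_rows, n_cols), dtype=np.int32))
    sequential_rows[griddim, blockdim](d_M)
    return d_M.copy_to_host()
```

On a machine with a supported GPU, `np.array_equal(run_cooperative(1024, 1024), reference(1024, 1024))` should hold. The `max_cooperative_grid_blocks` check matters because a grid sync can only work if all blocks are running at the same time, which caps the grid size and is something any GPU-STUMP variant would have to account for.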

seanlaw, Sep 16 '24 14:09