Wait/Test (some/any/all) for events?
Individual events returned by async kernel launches already have mechanisms to test or wait on them. However, if someone dispatches multiple kernels via forall and holds multiple events for those kernels, there doesn't seem to be a mechanism (that I've located, at least) to synchronously or asynchronously poll a set of events for completion (think MPI_Wait/Test(some/any/all)).
For instance, launching multiple kernels to pack information for a halo exchange requires that multiple events complete before the MPI sends are initiated. (Workgroups might be preferable here, but there is an intermediate fix that needs to be made in our code before we can use them.)
I have implementations of the above in our codebase that are analogous to the MPI async polling calls. Porting these back into RAJA (or CAMP, since the events returned from RAJA async foralls are camp objects) shouldn't be too difficult if people think they would be useful constructs to add.
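To make the request concrete, here is a minimal sketch of what the non-blocking "test" variants could look like. It is not the code from our codebase and not an existing RAJA or CAMP API: `MockEvent`, `test_all`, `test_any`, and `test_some` are hypothetical names, and the only assumption about the event type is that it exposes a non-blocking `check()` that returns true once the event has completed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an event handle. The helpers below only
// assume a non-blocking check() that reports completion.
struct MockEvent {
  bool done = false;
  bool check() const { return done; }
};

// True when every event in the set has completed (loosely MPI_Testall).
// An empty set is treated as complete.
template <typename Event>
bool test_all(const std::vector<Event>& events) {
  for (const auto& e : events) {
    if (!e.check()) return false;
  }
  return true;
}

// Index of the first completed event, or -1 if none has completed yet
// (loosely MPI_Testany).
template <typename Event>
int test_any(const std::vector<Event>& events) {
  for (std::size_t i = 0; i < events.size(); ++i) {
    if (events[i].check()) return static_cast<int>(i);
  }
  return -1;
}

// Indices of all events that have completed so far (loosely MPI_Testsome).
template <typename Event>
std::vector<std::size_t> test_some(const std::vector<Event>& events) {
  std::vector<std::size_t> completed;
  for (std::size_t i = 0; i < events.size(); ++i) {
    if (events[i].check()) completed.push_back(i);
  }
  return completed;
}
```

In the halo-exchange case, the caller would hold one event per pack kernel and start the MPI sends for whichever buffers `test_some` reports as ready, or for all of them once `test_all` returns true.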
That is sort of correct. We do not have a way to do something like multi_wait(e1, e2, e3). However, the ordering semantics are such that events can be interconnected, so a "wait all" could be handled by using a context and joining the events on it. The "wait any" case is interesting, though; perhaps these would be nice conveniences. Do you have a pointer to what the interface looks like?
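For discussion, the difference between the two blocking cases might be sketched as follows. This is only an illustration under assumptions: `FutureEvent`, `wait_all`, and `wait_any` are hypothetical names, and the event is modeled with a `std::shared_future` so the example is self-contained. It is not camp's event type and it does not show how a context would join events.

```cpp
#include <chrono>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Hypothetical event stand-in backed by a std::shared_future<void>.
// Assumed interface: non-blocking check() and blocking wait().
struct FutureEvent {
  std::shared_future<void> f;
  bool check() const {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
  }
  void wait() const { f.wait(); }
};

// "Wait all" needs no polling: blocking on each event in turn returns
// only after every event has completed, whatever order they finish in.
template <typename Event>
void wait_all(const std::vector<Event>& events) {
  for (const auto& e : events) e.wait();
}

// "Wait any" cannot be built from blocking waits alone, because the
// caller does not know which event finishes first. This sketch polls
// check() in a loop and returns the index of the first completed event,
// or events.size() if the set is empty.
template <typename Event>
std::size_t wait_any(const std::vector<Event>& events) {
  if (events.empty()) return events.size();
  for (;;) {
    for (std::size_t i = 0; i < events.size(); ++i) {
      if (events[i].check()) return i;
    }
    std::this_thread::yield();  // let other threads make progress
  }
}
```

The polling loop in `wait_any` is the part that has no obvious equivalent in pure ordering semantics, which is why it seems like the more interesting convenience to provide.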