Enable IPC events and update tests to test stream-ordered IPC
We probably don't have time to cover it in cuda.core 0.4.0, but we need to at least confirm that adding it later will not break the new IPC APIs. If it would, this is a blocking issue that must be resolved before code freeze.
@Andy-Jost should prove me wrong, but I believe we have already left the door open for this.
- In the exporting process:
  - create an `Event` with the `support_ipc` flag set
  - call `Event.to_ipc_descriptor()` to have something that can be serialized and sent over to the child
- In the importing process:
  - call a new `Event.from_ipc_descriptor()` constructor to reconstruct an `Event` object
The to/from methods are named in line with what we have for Buffer that's being modified in #1020.
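To make the intended handoff concrete, here is a rough sketch of the shape I have in mind. It makes no CUDA calls: `EventSketch` is a hypothetical stand-in for `Event`, the descriptor is modeled as plain `bytes` (the real descriptor type is not decided here), random bytes stand in for the opaque driver handle, and constructing the event directly with a `support_ipc` keyword is a simplification of however the option ends up being passed.

```python
import os
from multiprocessing import Pipe


class EventSketch:
    """Hypothetical stand-in for Event; makes no CUDA calls."""

    def __init__(self, support_ipc=False, _handle=None):
        self.support_ipc = support_ipc
        # Random bytes stand in for the opaque handle the driver would return.
        self._handle = _handle if _handle is not None else os.urandom(64)

    def to_ipc_descriptor(self):
        # Only events created with the IPC flag can be exported.
        if not self.support_ipc:
            raise RuntimeError("event was not created with support_ipc set")
        return bytes(self._handle)

    @classmethod
    def from_ipc_descriptor(cls, descriptor):
        # Alternate constructor used on the importing side.
        return cls(support_ipc=True, _handle=descriptor)


# A pipe models the parent-to-child channel; both ends live in one
# process here purely to keep the sketch runnable anywhere.
parent_end, child_end = Pipe()

# Exporting side: create the event, export a descriptor, send it.
exported = EventSketch(support_ipc=True)
parent_end.send_bytes(exported.to_ipc_descriptor())

# Importing side: receive the descriptor and rebuild the event.
imported = EventSketch.from_ipc_descriptor(child_end.recv_bytes())

# Both objects now refer to the same underlying (mock) handle.
assert imported._handle == exported._handle
```

The point is only that export is gated on the flag and that import is an alternate constructor, the same pattern as the `Buffer` methods.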
Andy, please confirm if my understanding is correct, and then I'll move this task to the next milestone/release.
Discussed with Andy offline:
- we agreed the above understanding is correct and there is no breaking change needed
- we will change the event option from `support_ipc` to `ipc_enabled` when we work on this task, to ensure a consistent UX. While it might seem like a breaking change, it is not, because the option has never worked so far: https://github.com/NVIDIA/cuda-python/blob/b09d7ed1f92ed89764cdb338759a18fc64f2032a/cuda_core/cuda/core/experimental/_event.pyx#L110-L111
Pushing this to the next cycle.
It has come to my attention (https://github.com/NVIDIA/cuda-python/pull/1209#issuecomment-3482693042) that our IPC tests still have room for (performance) improvement. For example, buffer comparison can be run asynchronously and on device, instead of synchronously and on host. I'll work on this.