Enable IPC events and update tests to test stream-ordered IPC

Open leofang opened this issue 5 months ago • 3 comments

We probably don't have time to cover this in cuda.core 0.4.0, but we need to at least confirm that adding it later would not break the new IPC APIs. If it would, this is a blocking issue that must be resolved before code freeze.

leofang • Sep 27 '25 02:09

@Andy-Jost should correct me if I'm wrong, but I believe we already left the door open for this.

  • In the exporting process:
    • create an Event with the support_ipc flag set
    • call Event.to_ipc_descriptor() to get something that can be serialized and sent to the child process
  • In the importing process:
    • call a new Event.from_ipc_descriptor() constructor to reconstruct an Event object

The to/from methods are named in line with what we have for Buffer, which is being modified in #1020.
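
The export/import flow above can be sketched with a pure-Python stand-in. This is only an illustration of the intended call shape, not the cuda.core implementation: MockEvent, its _handle attribute, and the use of raw bytes as the descriptor are all assumptions made for this sketch, and pickle stands in for whatever transport carries the descriptor to the child process. The flag is spelled support_ipc here to match the steps above.

```python
import os
import pickle


class MockEvent:
    """Hypothetical stand-in for Event; it does not touch CUDA at all."""

    def __init__(self, support_ipc=False, _handle=None):
        self.support_ipc = support_ipc
        # A real implementation would hold an opaque driver handle;
        # random bytes are used here purely as a placeholder.
        self._handle = _handle if _handle is not None else os.urandom(16)

    def to_ipc_descriptor(self):
        # Only events created with the IPC flag can be exported.
        if not self.support_ipc:
            raise RuntimeError("event was not created with support_ipc=True")
        return self._handle  # plain bytes, so it is trivially serializable

    @classmethod
    def from_ipc_descriptor(cls, descriptor):
        # Rebuild an event object that refers to the same underlying handle.
        return cls(support_ipc=True, _handle=descriptor)


# Exporting process: create the event and serialize its descriptor.
exported = MockEvent(support_ipc=True)
payload = pickle.dumps(exported.to_ipc_descriptor())

# Importing process: deserialize the descriptor and reconstruct the event.
imported = MockEvent.from_ipc_descriptor(pickle.loads(payload))
```

In this sketch both sides end up referring to the same handle bytes, which mirrors the idea that the imported Event refers to the event created by the exporter.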

Andy, please confirm if my understanding is correct, and then I'll move this task to the next milestone/release.

leofang • Oct 03 '25 01:10

Discussed with Andy offline:

  • we agreed that the above understanding is correct and that no breaking change is needed
  • when we work on this task, we will rename the event option from support_ipc to ipc_enabled to keep the UX consistent. This might look like a breaking change, but it is not, because the option has never worked so far: https://github.com/NVIDIA/cuda-python/blob/b09d7ed1f92ed89764cdb338759a18fc64f2032a/cuda_core/cuda/core/experimental/_event.pyx#L110-L111

Pushing this to the next cycle.

leofang • Oct 04 '25 15:10

It has come to my attention (https://github.com/NVIDIA/cuda-python/pull/1209#issuecomment-3482693042) that our IPC tests still have room for (performance) improvement. For example, buffer comparison can run asynchronously on the device instead of synchronously on the host. I'll work on this.

leofang • Nov 03 '25 21:11