Null MPI_Request for stream enqueued Waitall
If I'm not mistaken, MPIX_Waitall_enqueue() crashes with a failed assertion if the request array contains a regular MPI_REQUEST_NULL.
Conventionally, given a list of communication tasks and an array of requests, we initialize every entry in the array that doesn't have a corresponding operation to MPI_REQUEST_NULL. Then we pass the array to the regular Waitall function.
But for MPIX_Waitall_enqueue(), MPI_REQUEST_NULL fails the internal check, since it's just a reserved address without a proper kind. Conversely, any request object created for stream-enqueued communication would hang in a regular MPI_Waitall().
I don't think MPICH has an MPIX_Request. Is there an enqueued version of MPI_REQUEST_NULL, or what should we initialize idle requests to?
The requests passed to MPIX_Waitall_enqueue must come from MPIX_Isend_enqueue or MPIX_Irecv_enqueue, and they must all be enqueued to the same GPU stream.
@hzhou but shouldn't null requests be acceptable and handled like non-enqueued Waitall? I can't find this specified in the standard, but I think null requests are accepted generally including in MPICH, and often used that way?
If every process has a request array, allocated the same way and with the same size, the processes with fewer communication operations (like edge processes in a 2D Cartesian grid) would have null requests. It would be convenient for MPIX_Waitall_enqueue() to accept MPI_REQUEST_NULL entries instead of requiring different logic for each edge process in every unforeseeable circumstance.
Yes, MPI_REQUEST_NULL is accepted in MPI_Waitall. The use case is for applications that may have called MPI_Test earlier, where it is inconvenient to track which requests have already completed and become MPI_REQUEST_NULL. Note that there is no MPI_Test_enqueue. Stream enqueue semantics do not allow non-deterministic completion as with MPI_Test. Thus we believe there isn't a valid use case for passing MPI_REQUEST_NULL to MPIX_Waitall_enqueue. If it is passed to wait-enqueue, it is most likely a coding error, and it is better to fail explicitly.
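If the request array has to keep the same layout on every rank, one possible workaround is to compact out the null entries before the enqueued wait. The following is only a minimal sketch, not something MPICH provides: `compact_requests` is a hypothetical helper, and `request_t` and `REQUEST_NULL` are stand-ins for `MPI_Request` and `MPI_REQUEST_NULL` so that the example compiles without MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins (assumptions) so the sketch builds without <mpi.h>.
 * In real code these would be MPI_Request and MPI_REQUEST_NULL. */
typedef int request_t;
#define REQUEST_NULL (-1)

/* Hypothetical helper: move every non-null request to the front of the
 * array, preserving relative order, and return how many there are.
 * Slots past the returned count are reset to REQUEST_NULL. */
int compact_requests(request_t *reqs, int count)
{
    assert(count >= 0);
    assert(reqs != NULL || count == 0);

    int n = 0;
    for (int i = 0; i < count; i++) {
        if (reqs[i] != REQUEST_NULL) {
            reqs[n++] = reqs[i];
        }
    }
    for (int i = n; i < count; i++) {
        reqs[i] = REQUEST_NULL;
    }
    return n;
}
```

With the real types, the returned count and the compacted array would be what gets passed to MPIX_Waitall_enqueue, so an edge process with fewer neighbors simply waits on fewer requests. Any status array would then be indexed by compacted position rather than by the original slot.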
@AtlantaPepsi Do you have further questions regarding this issue?