COLL framework needs work to support MPI Bigcount

Open hppritcha opened this issue 1 year ago • 4 comments

In the course of the work done in #12226, it was discovered that, unlike the PML API, the COLL API is not ready for big count.

One option would be to extend the existing table of coll methods with entry points for large count functions. The upside is that we would initially only have to implement big count support in the basic and maybe the tuned components. The downside is that it would roughly double the size of the mca_coll_base_comm_coll_t struct.

The alternative, changing the definitions of all the existing methods so they are generalized to support big count, has the downside of needing to go into every existing component and make sure its collective methods can handle MPI_Count and MPI_Aint. If a method can't be modified to support big count, the component's implementation of that particular collective operation would have to be disqualified.
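As a rough illustration of the first option, here is a minimal sketch of a method table that carries a second, big count slot per collective. Everything in it is hypothetical: `sketch_count_t` stands in for MPI_Count, the signatures are far simpler than the real coll methods, and the struct is not mca_coll_base_comm_coll_t. It only shows how a dispatcher could prefer the big count slot and fall back to the int slot when the count still fits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for MPI_Count. */
typedef long long sketch_count_t;

/* Hypothetical signatures, much simpler than the real coll methods. */
typedef int (*sketch_bcast_fn_t)(void *buf, int count, int root);
typedef int (*sketch_bcast_c_fn_t)(void *buf, sketch_count_t count, int root);

enum { SKETCH_SUCCESS = 0, SKETCH_ERR_NOT_SUPPORTED = -1 };

/* One extra slot per collective: this is what roughly doubles the table. */
typedef struct {
    sketch_bcast_fn_t   bcast;    /* existing int-count entry point */
    sketch_bcast_c_fn_t bcast_c;  /* big count entry point, NULL if absent */
} sketch_coll_table_t;

/* Records which demo entry point ran last: 1 = int, 2 = big count. */
static int sketch_last_path = 0;

static int sketch_demo_bcast(void *buf, int count, int root)
{
    (void) buf; (void) count; (void) root;
    sketch_last_path = 1;
    return SKETCH_SUCCESS;
}

static int sketch_demo_bcast_c(void *buf, sketch_count_t count, int root)
{
    (void) buf; (void) count; (void) root;
    sketch_last_path = 2;
    return SKETCH_SUCCESS;
}

/* Prefer the big count slot; otherwise use the int slot if the count fits. */
static int sketch_dispatch_bcast(const sketch_coll_table_t *table, void *buf,
                                 sketch_count_t count, int root)
{
    assert(NULL != table);
    if (NULL != table->bcast_c) {
        return table->bcast_c(buf, count, root);
    }
    if (NULL != table->bcast && count >= 0 && count <= INT_MAX) {
        return table->bcast(buf, (int) count, root);
    }
    /* No entry point can handle this count. */
    return SKETCH_ERR_NOT_SUPPORTED;
}
```

With this layout, a component that only fills the int slot keeps working for small counts and is simply not usable once the count exceeds INT_MAX.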

Related to issue #9194 and PR #12226.

We do not plan to include this work in PR #12226, as it's already complex enough and is really targeted at the infrastructure for generating the _c MPI API C entry points for big count and the long-overdue, correct TS 29113 entry points for Fortran F08 (along with support for big count on the Fortran side too).

hppritcha avatar Feb 14 '24 17:02 hppritcha

Not all APIs will need to be doubled. Every API that handles a single count and displacement (per buffer) can simply be extended to take the larger count type. However, the APIs that use arrays of counts and displacements, where the access is more complicated (allgatherv, alltoallv, alltoallw, gatherv, reduce_scatter, scatterv, plus the non-blocking and persistent versions), will need to be doubled.
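To make the distinction concrete, here is a small sketch with hypothetical names (`sketch_count_t` stands in for MPI_Count; none of these functions exist in Open MPI). A scalar count is passed by value, so one widened entry point serves both callers. An array of counts is passed by pointer, and the element width is part of the pointer type, so the naive approach ends up with two near-identical functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MPI_Count. */
typedef long long sketch_count_t;

/* Scalar count: an int argument is promoted by value, so a single
 * widened entry point serves both small count and big count callers. */
static sketch_count_t sketch_bcast_bytes(sketch_count_t count,
                                         sketch_count_t elem_size)
{
    assert(count >= 0 && elem_size >= 0);
    return count * elem_size;
}

/* Array of counts, int flavor. */
static sketch_count_t sketch_gatherv_total(const int *counts, int nprocs)
{
    sketch_count_t total = 0;
    assert(nprocs >= 0);
    for (int i = 0; i < nprocs; ++i) {
        total += (sketch_count_t) counts[i];
    }
    return total;
}

/* Array of counts, big count flavor: same logic, different element type.
 * An int array cannot be reinterpreted as a sketch_count_t array, so the
 * code is duplicated unless some other mechanism hides the element type. */
static sketch_count_t sketch_gatherv_total_c(const sketch_count_t *counts,
                                             int nprocs)
{
    sketch_count_t total = 0;
    assert(nprocs >= 0);
    for (int i = 0; i < nprocs; ++i) {
        total += counts[i];
    }
    return total;
}
```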

This path is a nightmare: it will basically force us to maintain two copies of the same, already complex code, just to cope with the count and displacement type difference. I think I prefer to change the MCA coll type for counts/disps to void* and use macros to compute the right value, and then compile the code twice (once with int/int and once with MPI_Count/MPI_Aint).
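A minimal sketch of that idea, under stated assumptions: the counts and displacements arrive as `void*`, a macro fixes the element types, and the same source is instantiated twice. In a real build the two instantiations would more likely come from compiling one file twice with different definitions; here a function-generating macro stands in for that so the example fits in one file. All names are hypothetical, with `sketch_count_t` standing in for MPI_Count and `ptrdiff_t` for MPI_Aint.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MPI_Count. */
typedef long long sketch_count_t;

/* Instantiate one copy of the algorithm for a given pair of element types.
 * The algorithm returns the largest displacement + count over all peers,
 * i.e. how many elements the receive buffer must be able to hold. */
#define SKETCH_DEFINE_EXTENT(NAME, COUNT_T, DISP_T)                      \
    static sketch_count_t NAME(const void *counts, const void *disps,    \
                               int nprocs)                               \
    {                                                                    \
        const COUNT_T *c = (const COUNT_T *) counts;                     \
        const DISP_T *d = (const DISP_T *) disps;                        \
        sketch_count_t extent = 0;                                       \
        assert(nprocs >= 0);                                             \
        for (int i = 0; i < nprocs; ++i) {                               \
            sketch_count_t end = (sketch_count_t) d[i]                   \
                                 + (sketch_count_t) c[i];                \
            if (end > extent) {                                          \
                extent = end;                                            \
            }                                                            \
        }                                                                \
        return extent;                                                   \
    }

/* Same source, two instantiations: int/int and big count/address-sized. */
SKETCH_DEFINE_EXTENT(sketch_extent_int, int, int)
SKETCH_DEFINE_EXTENT(sketch_extent_big, sketch_count_t, ptrdiff_t)
```

The collective logic is written once; only the instantiation lines (and the corresponding table slots) are doubled.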

bosilca avatar Feb 14 '24 18:02 bosilca

> This path is a nightmare: it will basically force us to maintain two copies of the same, already complex code, just to cope with the count and displacement type difference. I think I prefer to change the MCA coll type for counts/disps to void* and use macros to compute the right value, and then compile the code twice (once with int/int and once with MPI_Count/MPI_Aint).

Okay. Would you also propose adding an extra arg to the methods to indicate whether the app was invoking a big count or a small count collective op?
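For comparison, a sketch of what the extra-argument variant could look like: a single copy of the code that takes `void*` counts plus a flag and picks the element type at run time. The names are hypothetical and `sketch_count_t` stands in for MPI_Count; this is only meant to show the shape of the idea, not an actual coll method.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MPI_Count. */
typedef long long sketch_count_t;

/* Read element i of a counts array whose element type is selected by the
 * big_count flag passed down from the MPI entry point. */
static sketch_count_t sketch_count_at(const void *counts, size_t i,
                                      bool big_count)
{
    assert(NULL != counts);
    if (big_count) {
        return ((const sketch_count_t *) counts)[i];
    }
    return (sketch_count_t) ((const int *) counts)[i];
}

/* A single copy of the collective logic, branching on the flag per access. */
static sketch_count_t sketch_total_flagged(const void *counts, size_t nprocs,
                                           bool big_count)
{
    sketch_count_t total = 0;
    for (size_t i = 0; i < nprocs; ++i) {
        total += sketch_count_at(counts, i, big_count);
    }
    return total;
}
```

The trade-off in this sketch is a run-time branch on every array access in exchange for a single compiled copy of the code.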

hppritcha avatar Feb 14 '24 18:02 hppritcha

This is also a possible approach, but not the one I was going for. My idea would double the size of the coll structure and add a little code to the build infrastructure, but would not double the size of the collective code.

bosilca avatar Feb 14 '24 19:02 bosilca

Oh, I see. That's kind of what @jtronge and I were thinking of doing, then.

hppritcha avatar Feb 14 '24 19:02 hppritcha

closed via #12621 and #12539

hppritcha avatar Jul 30 '24 15:07 hppritcha