implement large-count v-collectives using alltoallw and type_struct
Patrick Flick notes:
For v-collectives I've been able to use the regular MPI_Alltoallw (and avoid the neighborhood collectives). I wanted to share my approach and ask you for your opinion. In my experience this works for current versions of MPICH and OpenMPI.
As you mentioned in your paper, MPI_Alltoallw takes int displacements (in bytes), and thus can't be used on its own when a rank's data starts more than INT_MAX bytes into the buffer. However, what works for me is to wrap each datatype in an MPI_Type_create_struct with a single element (the type_contiguous) and specify the required offset as the displacement (which is an MPI_Aint). Then MPI_Alltoallw can be called with offset = 0 for all processes.
This is a great idea. It should be implemented in BigMPI.
@jeffhammond I'd gladly work on this (i.e., copy my approach over into your code base).
I've noticed you are differentiating between methods for the v-collectives (P2P, RMA, Alltoallw). Should I add a new one or change the neighborhood one to use the proposed method?
Please add a new one, possibly derived from the ALLTOALLW case (which I guess should be renamed to NEIGHBORHOOD_ALLTOALLW). That way it's easy to compare them, and there is a workaround if one of them is broken in a particular implementation (unlikely, it seems, based upon our respective experiences).
I apologize that this part of the code is a mess. I should refactor, but alas, I think I got the one and only paper out of this project that I can get, and my day job is publish-or-perish to some extent.