vapaa
implement doubly non-contiguous reductions
I think I can do this by calling MPI_Reduce_local one element at a time. The performance will be terrible, but nobody should be using this anyway.
The blocking case is feasible. The nonblocking case requires crazy stuff, maybe generalized requests.