mpich
ch4: shm: fix data type for recv_bytes in MPIDI_POSIX_mpi_release_gat…
The number of received bytes in release_gather_release is incorrectly cast between int and MPI_Aint. On most architectures this is harmless, but on big-endian 64-bit architectures (s390x) the actual value is lost. Fix the issue by writing the whole MPI_Aint to shm_buf instead of just an int.
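
To illustrate why the width mismatch only shows up on big-endian 64-bit targets, here is a minimal standalone sketch. It is not the MPICH code: `aint_t` is a hypothetical stand-in for a 64-bit MPI_Aint, the local `shm_buf` array stands in for the shared-memory buffer, and `copy_truncated`, `copy_full`, and `is_big_endian` are illustrative helpers. It assumes the failure comes from transferring only `sizeof(int)` bytes of a 64-bit count.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for MPI_Aint on a 64-bit platform. */
typedef int64_t aint_t;

/* Returns 1 when the lowest-addressed byte of a multi-byte integer
 * is its most significant byte. */
static int is_big_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Mismatched pattern (assumed): only the first 4 bytes of the 64-bit
 * count go through the buffer. On little-endian these are the low-order
 * bytes, on big-endian they are the high-order bytes. */
static int32_t copy_truncated(aint_t bytes)
{
    unsigned char shm_buf[sizeof(aint_t)] = {0};
    int32_t out;
    memcpy(shm_buf, &bytes, sizeof(int32_t));
    memcpy(&out, shm_buf, sizeof(out));
    return out;
}

/* Matching widths: the whole 64-bit count is written and read back. */
static aint_t copy_full(aint_t bytes)
{
    unsigned char shm_buf[sizeof(aint_t)] = {0};
    aint_t out;
    memcpy(shm_buf, &bytes, sizeof(bytes));
    memcpy(&out, shm_buf, sizeof(out));
    return out;
}
```

With a count of 4, `copy_full` returns 4 on any byte order, while `copy_truncated` returns 4 on a little-endian host and 0 on a big-endian 64-bit host, which is consistent with the "Received 0 but expected 4" failure in the log.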
This bug was found on 4.3.2 while debugging on s390x with ch4:ofi:
> mpiexec -np 4 ./file_info -fname test
Abort(476133135) on node 1 (rank 1 in comm 0): Fatal error in internal_Bcast: Other MPI error, error stack:
internal_Bcast(116)........................: MPI_Bcast(buffer=0x1004174, count=1, MPI_INT, 0, MPI_COMM_WORLD) failed
MPID_Bcast(295)............................:
MPIDI_Bcast_allcomm_composition_json(239)..:
MPIDI_Bcast_intra_composition_alpha(292)...:
MPIDI_POSIX_mpi_bcast(278).................:
MPIDI_POSIX_mpi_bcast_release_gather(127)..:
MPIDI_POSIX_mpi_release_gather_release(225): message sizes do not match across processes in the collective routine: Received 0 but expected 4
test:mpich/ch3/most test:mpich/ch4/most