bug: crashes with nonblocking collectives and isend/irecv
Originally by robl on 2016-10-04 16:34:12 -0500
The code HXHIM (formerly known as MDHIM) sometimes (at our urging) tries to use MPI to communicate between entities. It does not go well.
That is, we implemented a simple MDHIM RPC loop in MPI, run in a child thread, and in the main thread tested a bunch of MPI calls. We confirmed that there were spots where the MPI child thread interfered with the main thread. We then re-implemented the RPC code with MARGO (an HPC-oriented RPC framework based on Mercury and Argobots) and made sure that worked. It did!
In MPICH, the implementation crashes whenever any collective is combined with MPI_Isend/MPI_Irecv.
Originally by robl on 2016-10-04 16:38:02 -0500
Attachment added: margo_mpi_test[1].tgz (8.0 KiB)
test case for RPC-oriented workload
The attached test case targets an older version of margo/mercury. I'll have to update it to our latest API.