Min Si
In test `pt2pt/rqfreeb`, the last irecv request `r[4]` is intentionally freed by `MPI_Request_free` (line 111). In CH4/UCX, however, the request is completed in `MPIDI_UCX_recv_cmpl_cb`, which might or might not be...
When RMA put or get is used to implement the halo exchange in a 2D stencil, the east/west exchange performs much worse than it does with send/recv. Below is the performance...
PR: https://github.com/mpi-forum/mpi-standard/pull/93
Issue (has more discussion details): https://github.com/mpi-forum/mpi-issues/issues/114
The Git repository of openpa might be deleted at some point because it is no longer used in MPICH. We might want to find a replacement for it.
To ensure full progress in OSHMPI, the user has to enable either the async thread or Casper. The use of OSHMPI + Casper is not well documented.
Features needed:
- [ ] set -np per test
- [ ] set environment variable per test
Direct AMO mode still shows about 2x the overhead of SOS.

Benchmark: osu_oshm_atomics_all2one (shmem_int_finc -> MPI FOP)
```
            #SOS    #direct-amo
Theta/np=2  14.88   39.22
Cori/np=2   2.67    4.21
```
The current...
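To put a number on "about 2x", the per-platform ratio can be worked out from the four figures reported for osu_oshm_atomics_all2one. This is only a sketch: it assumes both columns are latencies in the same unit (the issue does not state one), and the names `results` and `slowdown` are illustrative, not part of OSHMPI or the OSU benchmarks.

```python
# Figures copied from the osu_oshm_atomics_all2one results in the issue.
# Assumption: both columns are latencies in the same (unstated) unit.
results = {
    "Theta/np=2": {"sos": 14.88, "direct_amo": 39.22},
    "Cori/np=2": {"sos": 2.67, "direct_amo": 4.21},
}


def slowdown(sos: float, direct_amo: float) -> float:
    """Return how many times slower direct-AMO mode is than SOS."""
    return direct_amo / sos


ratios = {
    platform: slowdown(row["sos"], row["direct_amo"])
    for platform, row in results.items()
}

for platform, ratio in ratios.items():
    print(f"{platform}: {ratio:.2f}x")
```

Under that assumption, the ratio is roughly 2.6x on Theta and 1.6x on Cori, so the gap varies noticeably between the two platforms.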
With team contexts, OSHMPI may use a separate communicator for collectives and a separate window for RMA/AMO. We want to evaluate the performance of multithreaded RMA/AMO/collectives with MPICH/VCI, which internally allocates dedicated...