Joseph Schuchart
Can you post the output of running your test with `--mca osc_verbose 100`, both the one with and without forcing osc/rdma? That might give a clue what is going on...
I was just looking into this (sorry for the delay). I can reproduce the error with mca/rdma, but I'm not sure it is expected to work under all circumstances on the...
@jotabf I tried your example with osc/ucx and get output both on shared memory and with multiple nodes in an IB network using Open MPI 4.1.1 built against UCX 1.10.0:...
@jotabf Removing the lock is trivial: move the `MPI_Win_lock` and `MPI_Win_unlock` out of the loop and use `MPI_Rget`+`MPI_Wait` inside the loop. No need to have the locks in each iteration...
@yosefe Regarding your question: why is the atomic operation needed in MPI_Rget/MPI_Get? I believe this is used to get a handle for the request to test/wait on. I added a...
@jotabf I believe this comes back down to what @yosefe said in https://github.com/open-mpi/ompi/issues/9580#issuecomment-962264701 (please correct me if I'm wrong here): your network does not support 64-bit atomic operations and the...
@roystgnr Thanks for the report. I think I have a lead, will file a PR shortly.
https://github.com/open-mpi/ompi/pull/10527 should fix it (seems like a copy&paste error)
https://github.com/open-mpi/ompi/pull/10527 was ported to 5.0.x, 4.1.x, and 4.0.x. Closing this issue.
Perfect, thanks a lot!