Joseph Schuchart

262 comments by Joseph Schuchart

@AboorvaDevarajan Can you please backport this fix to the release branches?

There is a Lustre filesystem available but I am not using it in my runs.

AFAICS, all software installation directories and my home (from where I built Open MPI and where the installation is located) are NFS. I couldn't find any linked library on a...

I tried both. The good news is: the SIGBUS disappears. The bad news is that runs tend to hang with both `^xpmem` and `^sm` at the end of the run...

Here is an example of how this issue manifests: in `mca_pml_ob1_recv_req_start` these three assignments:

```C
/* init/re-init the request */
req->req_lock = 0;
req->req_pipeline_depth = 0;
req->req_bytes_received = 0;
```
...

I put together some code examples on [godbolt](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1AB9U8lJL6yAngGVG6AMKpaAVxYMQAJg2kHAGTwGTAA5dwAjTGJvAGZSAAdUBUJbBmc3D29fROSbAUDgsJZI6K84y0xrVKECJmICdPdPHwtMKzyGGrqCAtCIqNiLWvrGzJaFYZ6gvuKBsoBKC1RXYmR2DgBSMqDkNywAag2Yxwn0JiIWPGQAOgQj7A2NAEEtmJ29zEPj06CCW/vHi9tgxdq4DkcTgR0OFUC5/jEHs9AZhVAQogx9jCXPtXMkjMYCAhiJgmOgFEcAEKAwEEACe8UwWCo%2B2MTwuV32v3251Ql2Qxl%2BBMpNPpjMwzImxFc1kOAHYqc99kruez%2BYKCNzhUjZQARfaS6UEYwAd0ICGMPL5AsEQpiCpezzpDKZ%2BwAbi5znRPlzXZartajQQtWzRS6DTKNvLAcq3X61TaNUxg5G9eGjabCcZ3QYbPQA7b7YCuVRSegLaqIHH8xqAFRMUicwRu%2BZy%2B0xvDMiC4oLAAlEklkluRtsx5XEggrDFVqiYAjIc2l4wo%2BK0K6ESsN10NtgsEi04wkLDEYzEgyqRnzYMxlP7NpKVvR0dKrkEFjxL56utXp/7OuHLxUjEequt%2BT7jpO%2ByvvEoFyjq1K6tSzzung6D6rOTAQGmJpmuWvL%2Buqv4KEOUaKsqCgALT3EwH77BoyYIUizzFou2aevQEC%2Bqq1a/vWjYaq6xEjs%2BnbdvihLEqSREPqRYGzhB06zvOFroGWy6rsg64QKyqp8TW8y8Vu%2Bw7nuB7EEeJ5tEw57oJedqPkqN53p8w72aOL5vjRX52TJo5/lsgHATBMbgcQGJQTBKbwXBjFPMhqFKAQrqYQQUrWNhmasbmmDcTWUkuT5FFUTRdHeS8DFPBwiy0JwACsvCeBwWikKgnCOPqyyrM5ZQ8KQBCaJViwANYgJIkjXDEACcABs02SGUE1eFwAAcXiyvonCSPV/XNZwvAKCAvh9Y1lWkHAsBIGgb5emQFAQJd8TXSguyGL2KWuAwg18HQaLEPtEDhNt4RBHUtKcD1l1sIIADyDC0KDx2kFgLAveICP4MSVSupg%2B0IyilSuGiYO8L8bTbau4TECDzhYNtKV4CwROLFQBjAAoABqeCYMaUMMg1PX8IIIhiOwUgyIIigqOoCO6Fw%2BgvSApjGOY5P7ZAiyoPEHQ4%2BRUNeHtbSVB09gME4LhNHoATTEUJR6DkKQCKMniy3bHS9NbAyyxUVQCF0Ixm5knsG97nSTG7/TRJ7kyO3oEzdGHswR4sCgdWsegpZg6w8FVtVbQjLUcKoS1TeRU2SPsz1GPsEBvR9LYQI4Da4IQJD/jEsv7M4V30MQrdcPMvBHVo8xDd4E3jVwE1F0tE%2ByrPNVTWt1UcJtpANU1%2Bd7QdvX9YsZ2ICAywEPEBPkPY%2BBEMQKF6ALwiiOIos3xLajbTLpDGpT8SM%2BtHB1av2351DAmx8NSoGZIXYupdy4GErtXKUtcq6dwet3Xu/dt7HWHqQYaXgprXC8EtNuC954xA0BoKaGgapxCXivNevAN4WC3oPAa389Z/zzrtNBQ9FhY1%2BqkEaQA). There are two variants: one using `_Atomic` and one using `volatile`. One function, `fadd_[atomic|volatile]` does what the opal wrappers do: check a...

I agree that this is a band-aid but I didn't feel comfortable proposing a complete overhaul. If we want to go down that road we might want to look at...

MPICH seems to be using a similar approach: https://github.com/pmodels/mpich/blob/main/src/mpl/include/mpl_atomic_c11.h (from my limited understanding of their code base)

Yes, there will be a bit of code to touch. Using GCC `__sync` builtins would help somewhat but has its drawbacks. They don't provide much control over the memory ordering,...

Closing in favor of https://github.com/open-mpi/ompi/pull/10613