Thomas Gillis
Yes, could you rerun with `FI_LOG_LEVEL=Debug`? And provide the whole log?
It's hard to understand what is going on without looking at the code, but here is what I can guess: ``` 1695433619::core:mr:ofi_monitor_subscribe():474 Failed (ret = -14) to monitor addr=0x3a97000 len=4096...
It should be fine. The error you get is generated when reading the actual data buffer :-) Rendezvous usually works along the lines of (1) register...
Both OpenMPI and MPICH (and crayMPI) are going to use the provided `libfabric` library, so you should be able to compile and run if you think it would help.
So typically (1) and (2) will happen when calling `MPI_Isend` (or `MPI_Start` for persistent send/recv), but (3) is going to happen during `MPI_Wait`. Once returned from `MPI_Wait`, the buffer...
test:mpich/custom netmod:ch4:ofi testlist:part