easybuild-easyconfigs
intel-2023b: fi_info not working as expected
While running the test jobs of MOLCAS-84, I came across this issue:
Abort(606203407) on node 10 (rank 10 in comm 0): Fatal error in PMPI_Put: Other MPI error, error stack:
PMPI_Put(160)........: MPI_Put(origin_addr=0x7ffde9f037a0, origin_count=5, MPI_LONG, target_rank=6, target_disp=50, target_count=5, MPI_LONG, win=0xe0000002) failed
MPID_Put(896)........:
MPIDI_put_safe(565)..:
MPIDI_put_unsafe(71).:
MPIDI_OFI_do_put(436): OFI rdma write immediate failed (ofi_rma.h:436:MPIDI_OFI_do_put:Invalid argument)
Given that more than one job failed, I did a bit of digging and noticed that this command is not working as expected:
$ fi_info | grep provider
fi_getinfo: -61
Further digging revealed that it works up to intel-2023a and also works with intel-2024a. So for me, intel-2023b clearly has a problem.
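For anyone who wants to repeat the comparison, here is a rough sketch of how the check could be scripted. It assumes the toolchains are available as modules named intel/2023a, intel/2023b and intel/2024a (adjust to your site's naming scheme), that the `module` command is defined in the shell running it, and that `fi_info` prints lines starting with `provider:` when it succeeds. The function names are made up for illustration. As far as I can tell, -61 corresponds to ENODATA on Linux, which libfabric reports when it finds no usable provider.

```shell
#!/usr/bin/env bash

# Classify fi_info output read from stdin: print "ok" if at least one
# provider line is present, "fail" otherwise (e.g. "fi_getinfo: -61").
classify_fi_info() {
    if grep -q '^provider:'; then
        echo ok
    else
        echo fail
    fi
}

# Run fi_info under each toolchain module given as an argument.
# The module names passed in are assumptions about the local setup.
check_toolchains() {
    local tc
    for tc in "$@"; do
        # Use a subshell so one toolchain's environment does not leak
        # into the next iteration.
        (
            module purge >/dev/null 2>&1
            if ! module load "$tc" >/dev/null 2>&1; then
                echo "$tc: module load failed"
                exit 0
            fi
            printf '%s: %s\n' "$tc" "$(fi_info 2>&1 | classify_fi_info)"
        )
    done
}

# Example call (hypothetical module names):
#   check_toolchains intel/2023a intel/2023b intel/2024a
```

Running `fi_info` in a fresh subshell per toolchain should rule out leftovers such as FI_PROVIDER_PATH from a previously loaded module influencing the result.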
My hunch is the problem is outside of what EasyBuild does. I will try and do some more digging.