Bad hdf5 writing performance with OpenMPI 3.x and 4.x on distributed file systems

Open LarsHuebner-LHNW opened this issue 1 year ago • 0 comments

Full dumps are slow on distributed file systems with recent versions of OpenMPI (3.x and 4.x), which default to ompio for MPI-I/O. MPICH and MPI implementations based on MPICH (e.g. Intel MPI) use romio and do not have this issue.

Setting the environment variables OMPI_MCA_io=^ompio and OMPI_MCA_fs_ufs_lock_algorithm=3, or calling mpirun with --mca io ^ompio --mca fs_ufs_lock_algorithm 3, disables ompio, enables romio, and fixes the slow writes.
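For anyone who wants to copy and paste, here is a minimal sketch of the two variants described above. The rank count (-np 4) and the binary name ./my_app are placeholders and not part of the original report. The caret is quoted so that shells with extended globbing do not interpret it. The mpirun lines are commented out so the snippet can be sourced from a job script without launching anything.

```shell
# Variant 1: set the MCA parameters through the environment.
# '^ompio' excludes the ompio component, so Open MPI falls back to romio.
export OMPI_MCA_io='^ompio'
export OMPI_MCA_fs_ufs_lock_algorithm=3
# mpirun -np 4 ./my_app   # placeholder rank count and binary

# Variant 2: pass the same parameters on the mpirun command line instead.
# mpirun --mca io '^ompio' --mca fs_ufs_lock_algorithm 3 -np 4 ./my_app
```

To see which io components a given build offers, running ompi_info and looking for the "MCA io" lines should list them. If no romio component is listed there, excluding ompio leaves no MPI-I/O component to fall back on.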

From what I found, this seems to be a known issue for (Fortran) applications using parallel HDF5 written for API v18 (see, for example, a note here or on this mailing list).

Thought I should share this information with other users.

LarsHuebner-LHNW, Mar 19 '24 18:03