netcdf-c
Add testing with hdf5-1.14.0?
hdf5-1.14.0 was just released. It contains some great performance improvements for HPC systems using compression! ;-)
For more details on these HDF5 improvements see: https://www.hdfgroup.org/2022/03/parallel-compression-improvements-in-hdf5-1-13-1/
Use of the 1.14.0 release seems to resolve the recently raised issue https://github.com/Unidata/netcdf-fortran/issues/389.
So this is wonderful and we are all very happy here at NOAA. But are you guys testing with hdf5-1.14.0? Seems like it works out of the box but I'm not testing exhaustively...
Hi Ed, glad to hear HPC folk are seeing performance improvements! We aren't testing against it yet, and an out-of-the-box test on macOS is returning some netCDF compilation errors (using a clang-based non-parallel build of hdf5 1.14.0), so I'll need to investigate that first. We'll get to it sooner rather than later, although I'm hoping to get v4.9.1 out shortly!
We are seeing two 1.14 HDF5-related issues with NetCDF parallel.
The first relates to an assert being triggered in HDF5 by the parallel NetCDF tests: https://github.com/HDFGroup/hdf5/issues/2433. If I remove the assert, all the NetCDF tests pass with HDF5 1.14.0.
The second relates to a hang when the MPI info hint romio_no_indep_rw is set to "true" with MPI_Info_set: https://github.com/HDFGroup/hdf5/issues/2434
@brtnfld what is the assert line?
nc_create_par will fail in HDF5 with assertion: ../../src/H5Fio.c:397: H5F_shared_vector_write: Assertion `types[i] != H5FD_MEM_GHEAP' failed.
I cannot reproduce this problem on my linux workstation.
Are you using mpich or openmpi?
You are building HDF5 with --enable-parallel and netcdf-c with --enable-parallel-tests?
And you are seeing a failure in the netcdf-c parallel test? Have you tried the HDF5 tests?
mpich 4.0.2
Yes, HDF5 with --enable-parallel, and netcdf-c configured with:
../configure --disable-byterange --enable-parallel-tests --enable-logging --prefix=${PREFIX} --enable-cdf5 --enable-netcdf-4 --enable-parallel4
It was a NetCDF-c test:
nc_create_par will fail assertion: ../../src/H5Fio.c:397: H5F_shared_vector_write: Assertion `types[i] != H5FD_MEM_GHEAP' failed.
It does not fail any HDF5 tests, but we have a reproducer and a fix at:
https://github.com/HDFGroup/hdf5/pull/2480
OK, that's a quick fix from the HDF5 team! Good work!
Meanwhile, should we also remove this assert from netCDF, since the HDF5 fix won't come out until the next release and we want to keep working with older versions of HDF5?
It is only 1.14.0. Older versions will not have this issue, since it was only introduced in 1.13.2. What do you mean by the assert in netCDF? The assert is only in HDF5.
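For anyone who wants to skip or flag parallel tests on the affected HDF5 releases, here is a minimal sketch of a version-range check in plain C. The function name hdf5_gheap_assert_affected and the helper pack_version are hypothetical and are not part of netcdf-c or HDF5. The lower bound (1.13.2) comes from the comment above; the upper bound assumes the fix in the linked PR ships in the first release after 1.14.0, which is not confirmed here.

```c
#include <stdio.h>

/* Hypothetical helper: pack major.minor.patch into one comparable integer.
 * Assumes minor and patch numbers stay below 1000. */
static long pack_version(int major, int minor, int patch)
{
    return ((long)major * 1000L + minor) * 1000L + patch;
}

/* Hypothetical check, not a netcdf-c or HDF5 API.
 * Returns 1 if the version string falls in the range reported as affected
 * in this thread (1.13.2 through 1.14.0, inclusive), 0 if it falls outside
 * that range, and -1 if the string is not of the form "major.minor.patch".
 * ASSUMPTION: the fix lands in the first release after 1.14.0. */
int hdf5_gheap_assert_affected(const char *version)
{
    int major, minor, patch;

    if (sscanf(version, "%d.%d.%d", &major, &minor, &patch) != 3)
        return -1;

    long v = pack_version(major, minor, patch);
    return v >= pack_version(1, 13, 2) && v <= pack_version(1, 14, 0);
}
```

A test driver could call hdf5_gheap_assert_affected with the version string of the HDF5 library it was built against and skip the parallel create tests when the result is 1.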
OK, this is working now. I will close this issue.