Document best practices and available controls for testing of HDF5
HDF5 should document known best practices and available knobs for controlling how and where HDF5 can be tested when building the library.
A relevant comment from @gheber on https://github.com/HDFGroup/hdf5/issues/6114, in the context of test failures occurring due to running HDF5 testing on an NFS mount:
Maybe we should give people a short guide like this:
Key knobs for keeping builds/tests off slow filesystems
- CMAKE_RUNTIME_OUTPUT_DIRECTORY, CMAKE_LIBRARY_OUTPUT_DIRECTORY, CMAKE_ARCHIVE_OUTPUT_DIRECTORY, CMAKE_TEST_OUTPUT_DIRECTORY, CMAKE_Fortran_MODULE_DIRECTORY: defaults point to /bin (and /mod for modules) but can be set at configure time to a fast scratch path, e.g., cmake -DCMAKE_RUNTIME_OUTPUT_DIRECTORY=/scratch/hdf5/bin -DCMAKE_TEST_OUTPUT_DIRECTORY=/scratch/hdf5/tests .. (config/HDFMacros.cmake:41-88). CMAKE_INSTALL_PREFIX is also set here; override it if you want the install tree on a different volume.
- HDF5_PREFIX: prepends a path to every serial test file; great for moving serial test I/O to a fast disk (test/h5test.c:50-54). Example: export HDF5_PREFIX=/scratch/$USER/hdf5_serial.
- HDF5_PARAPREFIX: same idea for MPI/parallel tests; critical to steer MPI-IO away from NFS/home (test/h5test.c:56-61,385-415; release_docs/README_HPC.md:302-334). Example: export HDF5_PARAPREFIX=/scratch/$USER/hdf5_par.
- HDF5_DRIVER / HDF5_TEST_DRIVER: selects the VFD for tests (test/h5test.c:42-45,2113). Setting HDF5_DRIVER=core keeps test files in memory (with an optional backing store), which can avoid slow storage timeouts; other values like sec2, split, multi, and family change how files are laid out.
- HDF5_NOCLEANUP: when set, test files are left on disk (useful for debugging, but can fill slow disks) (test/h5test.c:370-383).
- Test runtime controls to reduce exposure to slow disks: HDF5TestExpress (or the doc alias HDF5_TEST_EXPRESS) picks quick vs exhaustive suites (test/h5test.c:910-923; release_docs/README_HPC.md:335-343), and -DDART_TESTING_TIMEOUT=... raises the default 1200s test timeout if a slow FS is unavoidable (release_docs/README_HPC.md:344-350).
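As a rough illustration of the environment-variable knobs in the list above, the following shell sketch points serial and parallel test files at a scratch area. SCRATCH_ROOT and USER_NAME are hypothetical helper variables, not HDF5 settings, and the fallback to TMPDIR or /tmp is only so the snippet runs anywhere; creating the directories up front is a precaution, not something the list above says is required.

```shell
#!/bin/sh
# Sketch: steer HDF5 test I/O to a fast scratch area before running the tests.
# SCRATCH_ROOT and USER_NAME are hypothetical helpers, not HDF5 variables.
SCRATCH_ROOT="${SCRATCH_ROOT:-${TMPDIR:-/tmp}}"
USER_NAME="${USER:-$(id -un)}"

# Path prefixes for serial and parallel test files (directory names taken
# from the examples in the list above).
HDF5_PREFIX="$SCRATCH_ROOT/$USER_NAME/hdf5_serial"
HDF5_PARAPREFIX="$SCRATCH_ROOT/$USER_NAME/hdf5_par"

# Assumption: create the directories ourselves rather than rely on the tests.
mkdir -p "$HDF5_PREFIX" "$HDF5_PARAPREFIX"
export HDF5_PREFIX HDF5_PARAPREFIX

# Optional: keep test files in memory via the core VFD.
# export HDF5_DRIVER=core

echo "serial test files:   $HDF5_PREFIX"
echo "parallel test files: $HDF5_PARAPREFIX"
```

Sourcing a snippet like this (rather than executing it in a subshell) keeps the exported variables visible to a later test run in the same shell.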
Suggested use on slow filesystems: configure with the runtime, archive, and test output directories on a fast scratch volume; export HDF5_PREFIX and HDF5_PARAPREFIX to point at that volume as well; optionally use HDF5_DRIVER=core for tests; and, if timeouts persist, raise DART_TESTING_TIMEOUT or switch to a quicker (less exhaustive) express level.
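The configure-time half of this suggestion might look like the sketch below. It only prints a cmake command line instead of running it, so it can be inspected first. SCRATCH, SRC_DIR, TEST_TIMEOUT, and the hdf5_configure_cmd function are hypothetical names introduced for illustration; the lib subdirectory and the 3600-second timeout are arbitrary example values, not recommendations from the HDF5 documentation.

```shell
#!/bin/sh
# Sketch: assemble (but do not run) a configure command that keeps build and
# test outputs on a fast scratch volume. All variable names are hypothetical.
SCRATCH="${SCRATCH:-/scratch/hdf5}"      # fast scratch path (example value)
SRC_DIR="${SRC_DIR:-..}"                 # HDF5 source tree, relative to the build dir
TEST_TIMEOUT="${TEST_TIMEOUT:-3600}"     # example value above the 1200s default

hdf5_configure_cmd() {
    # printf reuses the format for each argument, yielding one command line.
    printf '%s ' cmake \
        "-DCMAKE_RUNTIME_OUTPUT_DIRECTORY=$SCRATCH/bin" \
        "-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=$SCRATCH/lib" \
        "-DCMAKE_TEST_OUTPUT_DIRECTORY=$SCRATCH/tests" \
        "-DDART_TESTING_TIMEOUT=$TEST_TIMEOUT" \
        "$SRC_DIR"
    printf '\n'
}

# Print the command for review; paste it into the build directory to run it.
hdf5_configure_cmd
```

Printing rather than executing is a deliberate choice here: output-directory settings are fixed at configure time, so it is worth checking the paths before committing a build tree to them.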