Enable data sieving for chunks that can't be cached

Open · jhendersonHDF opened this issue 4 weeks ago · 1 comment

Fixed an issue that prevented the library from using a data sieve buffer for I/O on dataset chunks when those chunks couldn't be cached. In the worst case, this caused I/O to be performed on a single data element at a time when chunks are non-contiguous with respect to memory layout.
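To illustrate why the missing sieve buffer matters, here is a toy model in plain C. It is not HDF5 code: `raw_read`, `read_strided_direct`, `read_strided_sieved`, and the `g_read_calls` counter are all hypothetical names, and the "file" is just an in-memory byte array. The sketch only counts how many low-level reads are issued when strided elements are fetched one at a time versus through a small contiguous buffer.

```c
#include <stddef.h>
#include <string.h>

/* Number of simulated low-level reads issued so far (hypothetical counter). */
static size_t g_read_calls = 0;

/* Simulated low-level read: copies nbytes from the in-memory "file". */
static void
raw_read(const unsigned char *file, size_t off, size_t nbytes, unsigned char *dst)
{
    memcpy(dst, file + off, nbytes);
    g_read_calls++;
}

/* Worst case: one low-level read per selected 1-byte element. */
static void
read_strided_direct(const unsigned char *file, size_t nelmts, size_t stride,
                    unsigned char *out)
{
    for (size_t i = 0; i < nelmts; i++)
        raw_read(file, i * stride, 1, &out[i]);
}

/*
 * Sieved: when the requested offset falls outside the buffered range,
 * refill the buffer with one contiguous read, then serve elements from it.
 * Assumes every offset i * stride is smaller than file_size.
 */
static void
read_strided_sieved(const unsigned char *file, size_t file_size, size_t nelmts,
                    size_t stride, unsigned char *sieve, size_t sieve_size,
                    unsigned char *out)
{
    size_t sieve_start = 0;
    size_t sieve_len   = 0; /* 0 means the buffer holds nothing yet */

    for (size_t i = 0; i < nelmts; i++) {
        size_t off = i * stride;

        if (off < sieve_start || off >= sieve_start + sieve_len) {
            size_t remaining = file_size - off;

            sieve_start = off;
            sieve_len   = remaining < sieve_size ? remaining : sieve_size;
            raw_read(file, sieve_start, sieve_len, sieve);
        }
        out[i] = sieve[off - sieve_start];
    }
}
```

In this toy setup, reading 64 one-byte elements at a stride of 16 from a 1024-byte array takes 64 low-level reads on the direct path and 4 on the sieved path with a 256-byte buffer. The real library's bookkeeping is more involved; the point is only the difference in the number of I/O calls.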

Added a test intended to catch performance regressions in I/O on dataset chunks that are non-contiguous with respect to memory layout.

Updated the External File List logic to set the data sieve buffer size to the smaller of the dataset size and the size set in the FAPL, similar to the logic elsewhere in the library.
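The sizing rule described above can be sketched as a small helper. This is an assumption-laden illustration, not the code in `H5D__efl_construct()`: `pick_sieve_buf_size` and its parameter names are hypothetical. The FAPL-side value is the one an application sets through the public `H5Pset_sieve_buf_size()` call.

```c
#include <stddef.h>

/*
 * Hypothetical helper: choose the sieve buffer size as the smaller of
 * the dataset's total size in bytes and the size configured in the FAPL.
 * A buffer larger than the dataset itself would only waste memory.
 */
static size_t
pick_sieve_buf_size(size_t dset_size_bytes, size_t fapl_sieve_size)
{
    return dset_size_bytes < fapl_sieve_size ? dset_size_bytes : fapl_sieve_size;
}
```

For example, a 4096-byte dataset with a 65536-byte FAPL setting would get a 4096-byte buffer, while a larger dataset would be capped at the FAPL setting.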


[!IMPORTANT] Re-enable data sieve buffer for non-cached dataset chunks to improve I/O performance and add a test for performance regressions.

  • Behavior:
    • Re-enable data sieve buffer for I/O on non-cached dataset chunks in H5D__chunk_read() and H5D__chunk_write() in H5Dchunk.c.
    • Update H5D__efl_construct() in H5Defl.c to set sieve buffer size to the smaller of dataset size and FAPL size.
    • Add test chunk_non_contig_mem_io in io_perf.c to catch performance regressions for non-contiguous memory layout chunks.
  • Misc:
    • Update CHANGELOG.md to document the performance fix.
    • Add io_perf to H5_EXPRESS_TESTS in CMakeLists.txt.

This description was created by Ellipsis for cb2384235fa9d832a1e13b33d10e1633c9673fa5.

jhendersonHDF · Dec 15 '25 23:12