
Contiguous datasets in SWMR mode cannot be read with hdf5 1.14.4+

PeterC-DLS opened this issue 1 month ago • 2 comments

Describe the bug While a SWMR file with a contiguous dataset is being written, a reader process that opens that dataset with hdf5 1.14.4-3 gets:

  File "/io/swmr-check.py", line 33, in read
    c = f[data_path]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/opt/python/cp310-cp310/lib/python3.10/site-packages/h5py/_hl/group.py", line 357, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 257, in h5py.h5o.open
KeyError: 'Unable to synchronously open object (invalid dataset size, likely file corruption)'

I suspect #4283 may have caused this issue.

Expected behavior In a prior release, such as 1.14.3, the dataset is opened successfully. I also encountered another issue with that release, which is noted below.

Platform (please complete the following information)

  • HDF5 version 1.14.4-3
  • OS and version RHEL 7
  • Compiler and version
  • Build system (e.g. CMake, Autotools) and version CMake
  • Any configure options you specified
  • MPI library and version (parallel HDF5) not used

Additional context An h5py script to reproduce the error:

import h5py
import numpy as np
from time import sleep

data_path = "test/data"

def write(out_name, chunk_size=12, reps=60, pause=4, chunk=False):
    d = np.arange(chunk_size, dtype=np.float64)
    print(d)
    total_shape = (chunk_size * reps,)
    with h5py.File(out_name, "w", libver="latest") as f:
        if chunk:
            f.create_dataset(data_path, dtype=d.dtype, shape=d.shape, maxshape=total_shape, chunks=d.shape, fillvalue=np.nan)
        else:
            f.create_dataset(data_path, dtype=d.dtype, shape=total_shape)
            # workaround: use a fixed-shape chunked dataset instead
            # f.create_dataset(data_path, dtype=d.dtype, shape=total_shape, maxshape=total_shape, chunks=total_shape, fillvalue=np.nan)
        f.swmr_mode = True
        # sleep(pause)
        c = f[data_path]
        for i in range(reps):
            b = i * chunk_size
            e = b + chunk_size
            s = slice(b, e)
            print(f"{i+1}/{reps}: {s}")
            if chunk:
                c.resize((e,))
            c[s] = d
            c.flush()
            sleep(pause)

def read(in_name, reps=60, pause=4):
    with h5py.File(in_name, "r", libver="latest", swmr=True) as f:
        c = f[data_path]
        count = 0
        while True:
            c.refresh()
            count += 1
            print(count, c.shape)
            print(c[:])
            if count == reps:
                break
            sleep(pause)

def main(reps=5):
    from multiprocessing import Process
    use_chunk = False
    filename = "swmr-test.h5"
    w = Process(target=write, args=(filename,), kwargs={"reps":reps, "chunk":use_chunk})
    r = Process(target=read, args=(filename,), kwargs={"reps":reps})
    w.start()
    sleep(0.5) # wait long enough for file to be created
    r.start()
    w.join()
    r.join()

if __name__ == "__main__":
    main()
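
The commented-out workaround in `write` above (a chunked dataset whose single chunk spans the full fixed shape) can be sketched stand-alone. This is a minimal illustration, not part of the original report; the file name and slice sizes here are arbitrary choices:

```python
import h5py
import numpy as np

fname = "swmr-workaround.h5"  # illustrative file name
total_shape = (60,)

with h5py.File(fname, "w", libver="latest") as f:
    # Workaround: instead of a contiguous dataset, create a chunked
    # dataset whose single chunk covers the whole fixed shape.
    dset = f.create_dataset(
        "test/data", dtype=np.float64, shape=total_shape,
        maxshape=total_shape, chunks=total_shape, fillvalue=np.nan)
    f.swmr_mode = True
    # Write and flush one block, as the repro script's writer does.
    dset[:12] = np.arange(12, dtype=np.float64)
    dset.flush()

# Reading back: the dataset opens without the KeyError seen for
# contiguous datasets, and unwritten elements hold the NaN fillvalue.
with h5py.File(fname, "r", libver="latest", swmr=True) as f:
    c = f["test/data"]
    print(c.shape, c[0], c[11])
```

The trade-off is that the whole dataset is a single chunk, so every flush rewrites one large chunk; whether that is acceptable depends on the dataset size.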

Note, also, that the contents of a contiguous dataset do not show updated values during SWMR writing in 1.14.3:

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11.]
1/5: slice(0, 12, None)
1 (60,)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
2/5: slice(12, 24, None)
2 (60,)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
3/5: slice(24, 36, None)
3 (60,)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
4/5: slice(36, 48, None)
4 (60,)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
5/5: slice(48, 60, None)
5 (60,)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
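
Since only contiguous datasets are affected, it may help to check a dataset's layout before relying on SWMR reads. A quick way (a sketch with an illustrative file name) is to inspect the `.chunks` attribute, which is `None` for contiguous datasets:

```python
import h5py
import numpy as np

with h5py.File("layout-check.h5", "w") as f:
    # Contiguous layout: no chunks argument given.
    contiguous = f.create_dataset("a", shape=(60,), dtype=np.float64)
    # Chunked layout: explicit chunk shape.
    chunked = f.create_dataset("b", shape=(60,), dtype=np.float64,
                               chunks=(12,))
    print(contiguous.chunks)  # None -> contiguous layout
    print(chunked.chunks)     # (12,) -> chunked layout
```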

PeterC-DLS avatar Jan 20 '25 09:01 PeterC-DLS