
Library appears not threadsafe

Open: JSybrandt opened this issue 6 years ago • 1 comment

I have an application that needs to write a large number of H5 files, and I coordinate these tasks using OpenMP. The program crashes when I use this library within a parallel loop. The specific error changes from run to run: usually a plain segfault, occasionally a double-free error. These errors disappear when I wrap the entire file creation / dataset writing process in an omp critical block.

Example errors:

*** Error in `./tsvs_to_ptbg': corrupted size vs. prev_size: 0x00000000024086d0 ***
make: *** [test] Segmentation fault
make: *** [test] Segmentation fault
*** Error in `./tsvs_to_ptbg': corrupted size vs. prev_size*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2edc51ed90 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
*** Error in `./tsvs_to_ptbg': double free or corruption (out): 0x00007f2e70137680 ***
make: *** [test] Aborted

Example code:

File h5_file(hdf5_path, File::OpenOrCreate);
DataSet a_ds = h5_file.createDataSet<size_t>("/a", DataSpace::From(a));
DataSet b_ds = h5_file.createDataSet<size_t>("/b", DataSpace::From(b));
DataSet c_ds = h5_file.createDataSet<size_t>("/c", DataSpace::From(c));
a_ds.write(a);
b_ds.write(b);
c_ds.write(c);
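The workaround mentioned above, wrapping the library calls in an omp critical block, might look roughly like the sketch below. It is only an illustration: `FakeH5Library`, `write_file`, `write_all`, and the section name `hdf5_io` are hypothetical stand-ins, not HighFive API, and the fake writer just records paths in a shared vector where the real code would create the file and write the datasets.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the HighFive calls in the example (File,
// createDataSet, write). It mutates shared state without any locking,
// so calling it concurrently from several threads is a data race.
struct FakeH5Library {
    std::vector<std::string> written;  // shared, unsynchronised state
    void write_file(const std::string& path) { written.push_back(path); }
};

// Writes n "files" from a parallel loop, serialising only the library calls.
// If the compiler is run without OpenMP enabled, the pragmas are ignored
// and the loop runs serially with the same result.
inline std::size_t write_all(FakeH5Library& lib, int n) {
    assert(n >= 0);
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Per-iteration preparation touches no shared state, so it can
        // stay outside the critical section and run in parallel.
        const std::string path = "out_" + std::to_string(i) + ".h5";

        // Named critical section: only one thread at a time may enter.
        #pragma omp critical(hdf5_io)
        {
            lib.write_file(path);
        }
    }
    return lib.written.size();
}
```

Calling `write_all(lib, 64)` on a fresh `FakeH5Library` leaves 64 entries in `lib.written`, in an order that depends on thread scheduling. Since every library call sits inside the critical section, only the preparation outside it actually runs in parallel.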

JSybrandt, Jan 08 '20 21:01

Sorry for letting this sit here so long!

Are you using an HDF5 library built with the thread-safety feature compiled in?

See https://docs.hdfgroup.org/hdf5/v1_12/_m_t.html and the older document at https://support.hdfgroup.org/HDF5/faq/threadsafe.html

The latter suggests that HDF5's multithreading support boils down to the equivalent of using an OpenMP critical section. To achieve full parallelism, I'd recommend using something like MPI to get fully independent processes to write files.

matz-e, Mar 09 '22 16:03