[Question] Is HDF5 the best format for data storage?

Open Howuhh opened this issue 1 year ago • 4 comments

Question

While HDF5 and h5py are the most popular approach to storing multi-dimensional arrays, they have some major limitations. For example, support for reading data from multiple processes or threads simultaneously is limited, which can matter when implementing efficient data loading.

There is an alternative, Zarr, which is very similar but a bit more capable. I think a discussion of this would be useful to the community.

Howuhh, Jun 23 '23 16:06