pynwb
Support for flat-binary backend
Some users would like to read/write data to/from flat-binary files. To support this, an extension of FORMIO must be created. Before doing this, the details of how data gets stored need to be fleshed out. As a first pass, I propose the following specification:
- HDF5 datasets are stored as flat-binary files
- HDF5 groups are created as folders
- JSON files are used to store HDF5 attributes
- HDF5 links are stored as soft-links
- HDF5 references are stored using some form of OID?
  - This would require more infrastructure development
- HDF5 region references are stored using the same mechanism as HDF5 references (see the previous item)
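To make the first three points of the proposal concrete, here is a minimal sketch using only the Python standard library. It is not an existing pynwb or FORMIO API: the helper names (`write_group`, `write_dataset`, `read_dataset`), the `attributes.json` file name, and the per-dataset JSON sidecar are all assumptions made for illustration. The sidecar is there because a flat-binary file carries no type or length information of its own.

```python
import json
import tempfile
from array import array
from pathlib import Path


def write_group(parent: Path, name: str, attrs: dict) -> Path:
    """Create a folder for a group and store its attributes as JSON."""
    group = parent / name
    group.mkdir(parents=True, exist_ok=True)
    # Hypothetical convention: one attributes.json per group folder.
    (group / "attributes.json").write_text(json.dumps(attrs))
    return group


def write_dataset(group: Path, name: str, values, typecode: str = "d") -> None:
    """Write values as raw bytes plus a JSON sidecar describing them."""
    data = array(typecode, values)
    with open(group / f"{name}.bin", "wb") as f:
        data.tofile(f)  # raw machine values, native byte order, no header
    # Hypothetical sidecar: the binary file alone cannot be interpreted.
    meta = {"typecode": typecode, "length": len(data)}
    (group / f"{name}.json").write_text(json.dumps(meta))


def read_dataset(group: Path, name: str) -> list:
    """Read a dataset back using the type and length from its sidecar."""
    meta = json.loads((group / f"{name}.json").read_text())
    data = array(meta["typecode"])
    with open(group / f"{name}.bin", "rb") as f:
        data.fromfile(f, meta["length"])
    return data.tolist()


root = Path(tempfile.mkdtemp())
acq = write_group(root, "acquisition", {"description": "raw recordings"})
write_dataset(acq, "voltage", [0.1, 0.2, 0.3])
restored = read_dataset(acq, "voltage")
```

Note that the sketch writes in native byte order and handles only one-dimensional data; a real specification would probably also need to record byte order and shape.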
See also #230. Ultimately this may or may not be relevant, but it looks like Zarr might be similar enough to h5py that one could drop it in to test how this could work.
Another project that might be useful for this is exdir: http://exdir.readthedocs.io
@kdharris101 also suggested ALF https://github.com/cortex-lab/alf2neuroscope#what-is-alf as a relevant standard for structuring storage of flat binary files. There is also a related issue on the nwb-schema repo: https://github.com/NeurodataWithoutBorders/nwb-schema/issues/57
exdir just came across my radar; @jeffteeters do you have any experience with it? Looks great!
I don't have any experience with it. I only recently learned about it.
- HDF5 links are stored as soft-links
This will not work on a typical Windows system.
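One possible way around this, sketched here purely as an assumption and not as anything proposed in this thread, is to try `os.symlink` first and fall back to a small JSON file that records the target when the link cannot be created. The `.link.json` suffix and the helper names `write_link` and `resolve_link` are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_link(link: Path, target: Path) -> str:
    """Create a soft link, or a JSON stand-in if symlinks are unavailable."""
    try:
        os.symlink(target, link, target_is_directory=target.is_dir())
        return "symlink"
    except (OSError, NotImplementedError):
        # Hypothetical fallback: record the target path in a sidecar file.
        sidecar = link.with_name(link.name + ".link.json")
        sidecar.write_text(json.dumps({"target": str(target)}))
        return "json"


def resolve_link(link: Path) -> Path:
    """Return the link target, whichever way the link was stored."""
    if link.is_symlink():
        return Path(os.readlink(link))
    sidecar = link.with_name(link.name + ".link.json")
    return Path(json.loads(sidecar.read_text())["target"])


root = Path(tempfile.mkdtemp())
target = root / "acquisition"
target.mkdir()
kind = write_link(root / "latest", target)
resolved = resolve_link(root / "latest")
```

The sketch stores absolute targets for brevity; relative targets would likely be needed if the folder tree has to stay valid after being moved or copied.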
I've been looking more closely at exdir & I'll add a +1 to @jeffteeters's suggestion.
They seem to have done most of the hard work here, and the API is meant to be a drop-in replacement for h5py, so it should be fairly easy to port the backend.
There isn't currently support for links but there is an open issue on their repo: https://github.com/CINPLA/exdir/issues/1
I feel like this is handled by Zarr. @oruebel, good to close?
Yes, this is being addressed by https://github.com/hdmf-dev/hdmf-zarr