
HDF5 support via h5py instead of pytables

Open karllark opened this issue 9 years ago • 5 comments

h5py support is easier to maintain and does not require the external hdf5 system library. Having to get this system library installed is one of the barriers to easy use of the BEAST. In addition, h5py may be more pythonic.

karllark avatar Dec 21 '16 18:12 karllark

requires HDF5 1.8.4 or newer, shared library version with development headers (libhdf5-dev or similar)

same as pytables. But astropy requires h5py.

mfouesneau avatar Feb 20 '17 20:02 mfouesneau

I have been using h5py for a bit, and found that it could not read the filters string from the grid attributes. Probably an instance of this issue https://github.com/h5py/h5py/issues/624

Example:

import h5py

# open read-only so the example file cannot be modified by accident
f = h5py.File('beast_example_phat_seds.grid.hd5', 'r')
g = f['grid']
g.attrs['filters']

results in

OSError                                   Traceback (most recent call last)
<ipython-input-10-0952a280e528> in <module>()
----> 1 g.attrs['filters']

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/Software/miniconda3/lib/python3.6/site-packages/h5py/_hl/attrs.py in __getitem__(self, name)
     79 
     80         arr = numpy.ndarray(shape, dtype=dtype, order='C')
---> 81         attr.read(arr, mtype=htype)
     82 
     83         if len(arr.shape) == 0:

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5a.pyx in h5py.h5a.AttrID.read()

h5py/_proxy.pyx in h5py._proxy.attr_rw()

OSError: Unable to read attribute (no appropriate function for conversion path)

Pytables does not have this problem. Of course, this example file was written by pytables, so this might be an incompatibility problem.

drvdputt avatar Apr 27 '18 12:04 drvdputt

Yep. If/when we move away from pytables, we will need to make sure all our existing hd5 files can be read by h5py. I still feel making the switch would be beneficial, since it would leave us with less code to maintain ourselves, but it is nontrivial.
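One way to find out which existing files are affected is to walk every group and dataset with h5py and try to read each attribute. The following is only a minimal sketch of that check. The helper name `find_unreadable_attrs` is hypothetical and not part of the BEAST, and the set of exceptions caught is an assumption based on the `OSError` in the traceback above.

```python
import h5py


def find_unreadable_attrs(path):
    """Return (object_name, attr_name, error_message) for each attribute h5py fails to read.

    Hypothetical helper for illustration; not part of the BEAST.
    """
    failures = []

    def check(name, obj):
        # Iterating over attrs only yields the names; the read happens on lookup.
        for attr_name in obj.attrs:
            try:
                obj.attrs[attr_name]
            except (OSError, TypeError, ValueError) as exc:
                failures.append((name, attr_name, str(exc)))

    with h5py.File(path, "r") as f:
        check("/", f)          # attributes on the root group
        f.visititems(check)    # attributes on every group and dataset below it
    return failures
```

Run on a file written by pytables, for instance `find_unreadable_attrs('beast_example_phat_seds.grid.hd5')`, an empty list would mean h5py can read every attribute. Judging by the traceback above, the `filters` attribute of `grid` would be expected to show up in the result.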

karllark avatar Apr 28 '18 18:04 karllark

I've come across another possible reason to move away from pytables: it doesn't seem to handle closing files very well. I haven't investigated very far, but as we do production runs with large files, it could be causing extra memory usage.

In [1]: import tables

In [2]: for i in range(5):
   ...:     x = tables.open_file('14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5')
   ...:

In [3]: exit
Closing remaining open files:
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
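For the bare loop above, each handle is rebound to `x` without ever being closed, which would explain the five "Closing remaining open files" lines at exit. Below is a minimal sketch of the same loop with each handle closed before the next open, using the pytables `File` context manager. The function name `open_repeatedly` is hypothetical, and this only covers handles opened in a loop like this one, not handles kept open elsewhere in the code.

```python
import tables


def open_repeatedly(path, n=5):
    """Open the same file n times, closing each handle before the next open.

    Hypothetical helper for illustration; returns the open/closed state
    observed inside and after each `with` block.
    """
    states = []
    for _ in range(n):
        with tables.open_file(path, mode="r") as h5:
            states.append(bool(h5.isopen))  # handle is open inside the block
        states.append(bool(h5.isopen))      # handle is closed once the block exits
    return states
```

With the handles closed this way, pytables should have nothing left to close when the interpreter exits.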

lea-hagen avatar Oct 29 '19 19:10 lea-hagen

Yep. See #64. I've investigated and not managed to figure out how to close this file manually. I like the idea that this is another reason to move away from pytables. :-)

karllark avatar Oct 29 '19 20:10 karllark