h5pyd
Error pulling a 'column' directly from a table with h5pyd
h5pyd is unable to pull a "column" from a recarray/table directly.
Example code using h5py:
In [14]: with h5py.File(path, mode='r') as f:
    ...:     sector = f['enumerations']['sector']['id']
    ...:
    ...:     print(sector)
    ...:
[b'com' b'res' b'trans' b'ind']
Same attempt in h5pyd:
In [12]: with h5pyd.File(hsds_path, mode='r') as f:
    ...:     sector = f['enumerations']['sector']['id']
    ...:
    ...:     print(sector)
    ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-ecf8d2f2a8ec> in <module>
      1 with h5pyd.File(hsds_path, mode='r') as f:
----> 2     sector = f['enumerations']['sector']['id']
      3
      4

~/miniconda3/lib/python3.9/site-packages/h5pyd/_hl/dataset.py in __getitem__(self, args)
    862     self.log.info("binary response, {} bytes".format(len(rsp)))
    863     #arr1d = numpy.frombuffer(rsp, dtype=mtype)
--> 864     arr1d = bytesToArray(rsp, mtype, page_mshape)
    865     page_arr = numpy.reshape(arr1d, page_mshape)
    866 else:

~/miniconda3/lib/python3.9/site-packages/h5pyd/_hl/base.py in bytesToArray(data, dt, shape)
    497 for index in range(nelements):
    498     offset = readElement(data, offset, arr, index, dt)
--> 499 arr = arr.reshape(shape)
    500 return arr
    501

ValueError: cannot reshape array of size 12 into shape (4,)
For reference, the source .h5 file is here: s3://oedi-data-lake/dsgrid-2018-efs/state_hourly_residuals/eia_annual_energy_by_sector.dsg
The HSDS domain is in the s3://nrel-pds-hsds/ bucket, here: '/nrel/dsgrid-2018-efs/state_hourly_residuals/eia_annual_energy_by_sector.dsg'
That's a feature not yet supported in h5pyd. As a workaround, you can read the desired selection into a numpy array and then extract the column from that array.
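A minimal sketch of that workaround follows. The helper name `read_column` is hypothetical and not part of h5pyd or h5py. It assumes the dataset is a 1-D table, so that slicing with `[:]` returns all rows as a structured array that can then be indexed by field name on the client side.

```python
def read_column(dataset, field):
    """Read every row of a table dataset, then pull out one field.

    `dataset` is assumed to be an h5pyd (or h5py) compound dataset;
    this helper is illustrative only and not part of either library.
    """
    # Step 1: read the full selection into memory. For a compound
    # dataset this should give a numpy structured array.
    records = dataset[:]
    # Step 2: extract the column locally, where field indexing works.
    return records[field]


# Usage sketch (needs h5pyd and access to the HSDS domain above):
# import h5pyd
# with h5pyd.File(hsds_path, mode='r') as f:
#     sector = read_column(f['enumerations']['sector'], 'id')
#     print(sector)
```

This transfers every field of every row, which should be fine for a small enumeration table like this one but could be costly for large tables.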
Thanks @jreadey, that was the workaround I suggested!