h5pyd
Variable length array
I'm creating a backend that transparently uses pytables, h5py and h5pyd. I ran my test suite on h5pyd and ran into an issue with variable-length arrays. h5py handles these with a "special" dtype: h5py.string_dtype or h5py.vlen_dtype. After some digging, I found the function special_dtype in h5pyd, whose docstring seems promising:
vlen = basetype
Base type for HDF5 variable-length datatype. This can be Python
str type or instance of np.dtype.
Example: special_dtype( vlen=str )
However, after trying it out, it seems to work only when vlen=str and not with a numpy dtype. Using a special dtype with np.uint32 as the base type, I could create a dataset, but when trying to access a given element I got this traceback:
  File "<ipython-input-114-66578725db5f>", line 1, in <module>
    dset[0]
  File "C:\Miniconda3\envs\pymodaq_dev\lib\site-packages\h5pyd\_hl\dataset.py", line 802, in __getitem__
    arr1d = bytesToArray(rsp, mtype, page_mshape)
  File "C:\Miniconda3\envs\pymodaq_dev\lib\site-packages\h5pyd\_hl\base.py", line 503, in bytesToArray
    offset = readElement(data, offset, arr, index, dt)
  File "C:\Miniconda3\envs\pymodaq_dev\lib\site-packages\h5pyd\_hl\base.py", line 467, in readElement
    arr[index] = vlen(0)
TypeError: 'numpy.dtype' object is not callable
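The last frame of the traceback calls the vlen base type as if it were a class. A minimal sketch of why that works for str but not for a numpy dtype is given here; make_empty is a hypothetical helper that only imitates the vlen(0) call, and the np.zeros alternative is an assumption about one possible way to build an empty element, not necessarily what h5pyd does.

```python
import numpy as np


def make_empty(vlen):
    # Hypothetical stand-in for the `arr[index] = vlen(0)` line in the traceback.
    return vlen(0)


# str is a class, so calling it with 0 simply returns a string.
as_text = make_empty(str)

# An np.dtype instance describes a type but cannot be called like a constructor.
base = np.dtype("uint32")
try:
    make_empty(base)
    call_failed = False
except TypeError:
    call_failed = True

# One possible alternative (an assumption): build an empty array of the base type.
empty = np.zeros(0, dtype=base)
```

Here as_text is a str, call_failed ends up True, and empty is a zero-length uint32 array.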
Then, a bit further in the code, I found:
def check_dtype(**kwds):
    """ Check a dtype for h5py special type "hint" information. Only one keyword may be given.

    vlen = dtype
        If the dtype represents an HDF5 vlen, returns the Python base class. Currently only builting string vlens (str) are supported. Returns None if the dtype does not represent an HDF5 vlen.
So the question is: is it or will it be possible to use any numpy dtype for variable length arrays in h5pyd?
Thx
After some more reading: your special_dtype function is the same as in the older h5py API (that is, the one used before h5py 2.10, which introduced string_dtype and vlen_dtype). So these are just different names for the same functionality, except that in h5pyd, numpy special types are not working... yet?
Hey - sorry somehow I missed this issue till now...
You can use h5pyd.special_dtype with numpy types like this example: https://github.com/HDFGroup/h5pyd/blob/master/test/hl/test_vlentype.py#L50.
There's also support for the new API, vlen_dtype, as described here: https://github.com/h5py/h5py/pull/1132.
E.g.: https://github.com/HDFGroup/h5pyd/blob/master/test/hl/test_dataset.py#L1640.
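As a rough illustration of the two spellings, here is a minimal sketch written against h5py with an in-memory file, so it runs without a server. The file name "vlen_demo.h5" and dataset name "vlen" are made up for the example. Whether h5pyd exposes the same helper names, and what domain path or server settings h5pyd.File needs, are assumptions you would have to check in your own setup.

```python
import numpy as np
import h5py

base = np.dtype("uint32")

# Old-style and new-style ways of describing a variable-length uint32 type.
old_dt = h5py.special_dtype(vlen=base)
new_dt = h5py.vlen_dtype(base)

# Both carry the base type as a hint that the check helpers can read back.
old_base = h5py.check_dtype(vlen=old_dt)
new_base = h5py.check_vlen_dtype(new_dt)

# In-memory file (core driver, no backing store), so nothing is written to disk.
with h5py.File("vlen_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("vlen", (3,), dtype=new_dt)
    # Each element can hold an array of a different length.
    dset[0] = np.array([1, 2, 3], dtype=base)
    dset[1] = np.array([4], dtype=base)
    first = dset[0]
    second = dset[1]
```

After the round trip, first and second come back as uint32 arrays of length 3 and 1 respectively.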
The only special type missing is for regionrefs - which hopefully will get added soon.
I'll leave this issue open as a reminder to remove the old-style check_dtype and special_dtype functions, since h5py has deprecated them in favor of the new API.