h5pyd
iterate over chunks on dataset initialization
Fixes #88

Hi, we are running into the above issue: the HSDS server returns 413 errors when we try to write HDF5 files of around 200MB to HSDS. We've implemented the solution suggested in the issue here, and it seems to resolve the problem for us.
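For context, below is a rough sketch of the chunk-by-chunk write pattern the change is based on: instead of sending the whole array in a single request, each chunk-sized selection is written separately so every request stays under the server's payload limit. The domain path, array shape, and chunk shape are made up for illustration.

```python
import numpy as np
import h5pyd

# Hypothetical example: write a large array to HSDS one chunk at a time so
# each request stays under the server's payload limit, rather than sending
# the whole buffer in a single request (which triggers the 413 error).
data = np.random.rand(8192, 8192)
chunk_shape = (1024, 1024)

with h5pyd.File("/home/test_user1/large_data.h5", "w") as f:
    dset = f.create_dataset("x", shape=data.shape, dtype=data.dtype,
                            chunks=chunk_shape)
    # Walk the chunk grid and write one chunk-sized selection per request.
    for i in range(0, data.shape[0], chunk_shape[0]):
        for j in range(0, data.shape[1], chunk_shape[1]):
            sel = np.s_[i:i + chunk_shape[0], j:j + chunk_shape[1]]
            dset[sel] = data[sel]
```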
`create_dataset` is failing for scalar datasets after this change, but I think it is just uncovering an existing issue in the ChunkIterator:
```
Traceback (most recent call last):
  File "test_complex_numbers.py", line 57, in test_complex_attr
    dset = f.create_dataset('x', data=5)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/h5pyd-0.8.2-py3.7.egg/h5pyd/_hl/group.py", line 338, in create_dataset
    for chunk in it:
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/h5pyd-0.8.2-py3.7.egg/h5pyd/_apps/chunkiter.py", line 111, in __next__
    if self._chunk_index[0] * self._layout[0] >= self._shape[0]:
IndexError: tuple index out of range
```
The `if self._layout == ()` block in `ChunkIterator.__next__` seems intended to catch this case before it reaches the code in the traceback, but HSDS currently reports a chunk layout of `(1,)` for scalar datasets, so that branch is never taken. Perhaps the check could be replaced by `if self._shape == ()`?
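To illustrate the proposed check, here is a toy, self-contained iterator (not the actual h5pyd source) showing how keying the scalar-dataset guard off the shape rather than the layout avoids the IndexError when the layout is reported as `(1,)`:

```python
class ScalarAwareChunkIterator:
    """Toy chunk iterator (rank 0 and rank 1 only) to illustrate the
    proposed shape-based guard; not the h5pyd implementation."""

    def __init__(self, shape, layout):
        self._shape = shape      # () for a scalar dataset
        self._layout = layout    # HSDS reports (1,) even for scalars
        self._index = 0
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        # Proposed check: key the scalar case off the dataset shape, not the
        # layout, so a (1,) layout on a scalar dataset never reaches the
        # self._shape[0] lookup that raises IndexError.
        if self._shape == ():
            if self._done:
                raise StopIteration
            self._done = True
            return ()  # single selection covering the scalar value
        if self._index * self._layout[0] >= self._shape[0]:
            raise StopIteration
        start = self._index * self._layout[0]
        stop = min(start + self._layout[0], self._shape[0])
        self._index += 1
        return (slice(start, stop),)


# The scalar case now yields exactly one selection instead of raising:
list(ScalarAwareChunkIterator((), (1,)))    # -> [()]
list(ScalarAwareChunkIterator((5,), (2,)))  # -> [(slice(0, 2),), (slice(2, 4),), (slice(4, 5),)]
```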