Error fetching sharded data
Hi,
I'm trying to load some of the Janelia hemibrain EM image data but ran into this issue:
>>> vol = CloudVolume('precomputed://gs://neuroglancer-janelia-flyem-hemibrain/emdata/clahe_yz/jpeg', fill_missing=True)
>>> vol[6300:6500, 20400:20600:, 14000]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-59-f9e19f62db12> in <module>
----> 1 vol[6300:6500, 20400:20600:, 14000]
~/.local/lib/python3.7/site-packages/cloudvolume/frontends/precomputed.py in __getitem__(self, slices)
~/.local/lib/python3.7/site-packages/cloudvolume/frontends/precomputed.py in download(self, bbox, mip, parallel, segids, preserve_zeros, agglomerate, timestamp, stop_layer)
~/.local/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/__init__.py in download(self, bbox, mip, parallel, location, retain, use_shared_memory, use_file, order)
~/.local/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/sharding.py in from_dict(cls, vals)
TypeError: __init__() missing 1 required positional argument: 'data_encoding'
Any idea what I'm doing wrong here?
Thanks! :)
Forgot to add: this is with cloudvolume 1.18.1
With more traceback:
<ipython-input-2-dbdeda5c0ce0> in <module>
----> 1 vol[6300:6500, 20400:20600:, 14000]
~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/cloudvolume/frontends/precomputed.py in __getitem__(self, slices)
526 requested_bbox = Bbox.from_slices(slices)
527
--> 528 img = self.download(requested_bbox, self.mip)
529 return img[::steps.x, ::steps.y, ::steps.z, channel_slice]
530
~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/cloudvolume/frontends/precomputed.py in download(self, bbox, mip, parallel, segids, preserve_zeros, agglomerate, timestamp, stop_layer)
568 parallel = self.parallel
569
--> 570 img = self.image.download(bbox, mip, parallel=parallel)
571
572 if segids is None:
~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/image/__init__.py in download(self, bbox, mip, parallel, location, retain, use_shared_memory, use_file, order)
113 scale = self.meta.scale(mip)
114 if 'sharding' in scale:
--> 115 spec = sharding.ShardingSpecification.from_dict(scale['sharding'])
116 return rx.download_sharded(
117 bbox, mip,
~/.pyenv/versions/3.7.5/lib/python3.7/site-packages/cloudvolume/datasource/precomputed/sharding.py in from_dict(cls, vals)
123 vals['type'] = vals['@type']
124 del vals['@type']
--> 125 return cls(**vals)
126
127 def to_dict(self):
TypeError: __init__() missing 1 required positional argument: 'data_encoding'
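For what it's worth, the failure mode here is just dict unpacking: `cls(**vals)` raises a `TypeError` whenever a required `__init__` parameter is absent from the metadata dict, which is what happens when the sharding spec in the dataset's info file omits `data_encoding`. A minimal standalone reproduction (hypothetical class, not CloudVolume's actual implementation):

```python
# Hypothetical stand-in for ShardingSpecification, reduced to two fields
# to show the mechanism behind the TypeError above.
class Spec:
    def __init__(self, type, data_encoding):
        self.type = type
        self.data_encoding = data_encoding

    @classmethod
    def from_dict(cls, vals):
        vals = dict(vals)                  # don't mutate the caller's dict
        vals['type'] = vals.pop('@type')   # '@type' -> 'type', as in sharding.py
        return cls(**vals)                 # raises if 'data_encoding' is missing

try:
    # Mimics an info file whose sharding block lacks 'data_encoding'.
    Spec.from_dict({'@type': 'neuroglancer_uint64_sharded_v1'})
except TypeError as e:
    print(e)
```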
I got it to work by manually setting the data_encoding:
vol.scale['sharding']['data_encoding'] = 'raw'
m = vol[6300:6500, 20400:20600:, 14000]
It is rather slow for a 200x200 cutout though:
%timeit m = vol[6300:6500, 20400:20600:, 14000]
12.8 s ± 5.04 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
Is there a way I can speed it up?
It looks like Neuroglancer defaults to raw for the data encoding, so who am I to judge?
https://github.com/google/neuroglancer/blob/1768c2271a3623264063173f4ba96b2013f8129d/src/neuroglancer/datasource/precomputed/frontend.ts#L344-L347
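If it helps, the fix presumably amounts to giving `data_encoding` (and likewise `minishard_index_encoding`) a `'raw'` default, so `from_dict` tolerates info files that omit them. A sketch under that assumption (field names taken from the precomputed sharded format, not from the actual PR):

```python
# Sketch: a from_dict that tolerates missing encoding fields by falling
# back to Neuroglancer's default of 'raw'. Not CloudVolume's real code.
class ShardingSpecification:
    def __init__(self, type, preshift_bits, hash, minishard_bits, shard_bits,
                 minishard_index_encoding='raw', data_encoding='raw'):
        self.type = type
        self.preshift_bits = preshift_bits
        self.hash = hash
        self.minishard_bits = minishard_bits
        self.shard_bits = shard_bits
        self.minishard_index_encoding = minishard_index_encoding
        self.data_encoding = data_encoding

    @classmethod
    def from_dict(cls, vals):
        vals = dict(vals)                  # copy so the caller's dict is untouched
        vals['type'] = vals.pop('@type')
        return cls(**vals)                 # defaults cover any omitted keys

# An info-file sharding block with no 'data_encoding' now parses cleanly:
spec = ShardingSpecification.from_dict({
    '@type': 'neuroglancer_uint64_sharded_v1',
    'preshift_bits': 9,
    'hash': 'murmurhash3_x86_128',
    'minishard_bits': 6,
    'shard_bits': 15,
})
print(spec.data_encoding)  # 'raw'
```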
I'm fixing this in PR #356.
Is it still slow for you? The same code is taking 2.7 sec for me.