cubed
Use google tensorstore to read/write to zarr?
I expect you're already aware of this, @tomwhite, but I wanted to ask whether you think the google-tensorstore project might be useful in cubed. @rabernat suggested benchmarking its performance against zarr-python + fsspec. Presumably any improvement to the I/O speed would considerably increase the performance of cubed? Stephan recently released an xarray interface to tensorstore, but it seems like that's not the right entry point for using it with cubed.
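For context, here is a rough sketch of what reading an existing Zarr array through TensorStore's Python API could look like. The helper names `zarr_spec` and `read_with_tensorstore` are made up for illustration, and the sketch assumes a Zarr v2 array on the local filesystem; it is not taken from cubed.

```python
def zarr_spec(path):
    """Build a TensorStore JSON spec for a Zarr v2 array in a local directory.

    `zarr_spec` is a hypothetical helper, not part of TensorStore or cubed.
    """
    # Normalise to a trailing slash so the path is treated as a directory prefix.
    return {
        "driver": "zarr",
        "kvstore": {"driver": "file", "path": path.rstrip("/") + "/"},
    }


def read_with_tensorstore(path):
    """Read the whole array at `path` into memory as a NumPy array."""
    # Imported lazily so that zarr_spec() works without TensorStore installed.
    import tensorstore as ts

    arr = ts.open(zarr_spec(path)).result()  # open() is asynchronous; result() blocks
    return arr.read().result()  # read() is asynchronous as well
```

Both `ts.open` and `read` return futures, so a real integration could probably issue several chunk reads before waiting on any of them.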
Yes, I saw this and think it could be very interesting for Cubed.
> Presumably any improvement to the I/O speed would considerably increase the performance of cubed?
Hopefully, but we'd have to try it out to see.
I tried swapping out Zarr for TensorStore here: https://github.com/tomwhite/cubed/tree/tensorstore. All unit tests pass, except for test_resume (since TensorStore doesn't have a way of accessing Zarr's nchunks_initialized).
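One possible workaround for the missing nchunks_initialized, sketched below, is to count the chunk files in the store directly. This is only an illustration, not what the branch does: `count_initialized_chunks` is a hypothetical helper, and it assumes a Zarr v2 array in a local directory with the default "." chunk-key separator.

```python
import os
import re

# Zarr v2 chunk keys with the default separator look like "0", "0.1" or "3.2.1".
# Metadata files such as ".zarray" and ".zattrs" do not match this pattern.
_CHUNK_KEY = re.compile(r"^\d+(\.\d+)*$")


def count_initialized_chunks(array_dir):
    """Count the chunk files present in a local Zarr v2 array directory."""
    return sum(1 for name in os.listdir(array_dir) if _CHUNK_KEY.match(name))
```

This would not work as written for remote stores or for arrays that use "/" as the chunk-key separator; those cases would need a store listing instead of os.listdir.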
There were a few challenges due to differences in how structured arrays are handled (we use them for reductions like mean), but it looks like things would work. We will probably want a way to specify the array storage in the spec so that users can select the one they want.
I haven't done any performance testing, but the current branch is sufficiently developed to try that.
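If anyone wants to try a comparison, a small timing harness along these lines might be a starting point. The `time_call` and `compare` helpers are made up for illustration; the callables passed in would wrap whatever read or write path is being measured.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock time in seconds of calling `fn()` `repeats` times."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compare(readers, repeats=5):
    """Time each named callable and return a dict of name -> median seconds."""
    return {name: time_call(fn, repeats) for name, fn in readers.items()}
```

Repeated reads of the same local files are likely to be served from the OS page cache, so the first run and later runs can differ a lot; that is worth keeping in mind when interpreting any numbers.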