Computations requiring irregularly-chunked Zarr stores
All intermediate results in Cubed are written out to persistent storage via Zarr, but currently Zarr can't represent arbitrarily chunked arrays, because the Zarr spec does not yet support irregular chunks.
This comes up in Cubed when a computation changes the chunking of an array from regularly chunked to irregularly chunked. An important example of such a computation is groupby; see https://github.com/tomwhite/cubed/issues/223.
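To make the regular/irregular distinction concrete, here is a minimal pure-Python sketch. It assumes, purely for illustration, that a grouped operation wants one output chunk per group; that is not necessarily how Cubed would implement groupby. The helpers `is_regular` and `group_chunks` are hypothetical names, not part of Cubed or Zarr.

```python
from itertools import groupby


def is_regular(chunks):
    """Return True if the chunk sizes along one axis fit a regular grid:
    every chunk has the same size, except that the last may be smaller."""
    if len(chunks) <= 1:
        return True
    first = chunks[0]
    return all(c == first for c in chunks[:-1]) and 0 < chunks[-1] <= first


def group_chunks(sorted_labels):
    """Hypothetical: one output chunk per run of equal labels."""
    return tuple(len(list(run)) for _, run in groupby(sorted_labels))


labels = ["a", "a", "a", "b", "c", "c", "c", "c", "c"]

input_chunks = (3, 3, 3)               # regular: one chunk size describes the axis
output_chunks = group_chunks(labels)   # chunk sizes follow the group sizes

print(is_regular(input_chunks))
print(output_chunks, is_regular(output_chunks))
```

Because the group sizes depend on the data, the output chunk sizes cannot in general be described by a single chunk shape, which is all a regular chunk grid can record.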
I'm not sure if there are any other examples of array operations that might change regular chunks into irregular ones? np.pad comes to mind?
You could implement pad in Cubed using concatenate, which already exists but has to copy the Zarr arrays to make the result regularly chunked. If Zarr supported irregular chunking, that copy could be avoided and pad could be more efficient.
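As a rough sketch of why the copy is needed, the following pure-Python snippet tracks only chunk sizes along one axis when padding is expressed as a concatenation. The function names and the sizes used (length 10, chunk size 4, pad width 1) are assumptions chosen for illustration; this does not use or mirror Cubed's actual concatenate implementation.

```python
def regular_chunks(length, chunk_size):
    """Chunk sizes for a regular grid covering `length` elements."""
    full, rest = divmod(length, chunk_size)
    return (chunk_size,) * full + ((rest,) if rest else ())


def concatenated_chunks(*chunk_tuples):
    """Chunk sizes obtained by placing arrays end to end without rewriting data."""
    out = []
    for chunks in chunk_tuples:
        out.extend(chunks)
    return tuple(out)


pad_width = 1
array_chunks = regular_chunks(10, 4)

# Padding on both sides, expressed as concatenate([left_pad, array, right_pad]).
padded_chunks = concatenated_chunks((pad_width,), array_chunks, (pad_width,))

# What a regular-only store needs instead: the same total length, re-cut at chunk size 4.
rechunked = regular_chunks(sum(padded_chunks), 4)

print(array_chunks)    # (4, 4, 2)
print(padded_chunks)   # (1, 4, 4, 2, 1)
print(rechunked)       # (4, 4, 4)
```

The chunk boundaries of the padded layout no longer line up with any regular grid, so writing the result as a regular Zarr array means rewriting the data into new chunks. With irregular chunk support, the original chunks could in principle be kept as they are and only the small pad chunks added.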