cubed
Bounded-memory serverless distributed N-dimensional array processing
See image for demonstration.  `np.nanmean` is called by xarray's `.mean()` method when `skipna=True`, which is the default.
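The difference between the NaN-aware and plain mean can be seen with NumPy alone. This is a minimal sketch with made-up values, not the data from the issue's image:

```python
import numpy as np

# Illustrative values only; one entry is missing (NaN).
values = np.array([1.0, np.nan, 3.0])

# np.mean propagates NaN: a single missing value makes the result NaN.
plain_mean = np.mean(values)

# np.nanmean ignores NaN entries and averages the rest: (1.0 + 3.0) / 2.
nan_aware_mean = np.nanmean(values)

print(plain_mean, nan_aware_mean)
```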
Lithops uses an async programming model, but not Python asyncio. It would be nice if we could bridge the two, as then we'd be able to use the common asyncio...
Add a GH Actions workflow that runs Cubed tests using Zarr V3 storage. This will ensure that the upcoming work to update Zarr to support the V3 spec (see https://github.com/zarr-developers/zarr-python/discussions/1480)...
(I wrote this to help track what work needs to be done on the executors, but it might be useful to add to the user docs at some point.) This...
#221 introduced `merge_chunks`, a special case of `rechunk` that can be implemented using `blockwise`. I noticed that whilst `reduction` calls `merge_chunks` directly, inside `ops.rechunk` the [primitive rechunk is always called](https://github.com/tomwhite/cubed/blob/93ad984e7b0445164ab11b3c3f3a3b7db6c3bc97/cubed/core/ops.py#L631C11-L631C11). Shouldn't...
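To illustrate why merging chunks is a blockwise-friendly special case, here is a rough NumPy sketch. The helper `merge_adjacent_chunks` and its `factor` parameter are hypothetical names for illustration and are not Cubed's API. The point is that, when each target chunk is a whole multiple of the source chunk size, every output block depends only on a fixed group of adjacent input blocks:

```python
import numpy as np

def merge_adjacent_chunks(chunks, factor):
    """Merge every `factor` adjacent 1-D chunks into one larger chunk.

    Hypothetical helper, not part of Cubed. Each output block is built
    from a fixed, contiguous group of input blocks, so no all-to-all
    data movement is needed.
    """
    return [
        np.concatenate(chunks[i:i + factor])
        for i in range(0, len(chunks), factor)
    ]

# Ten elements stored as five chunks of size 2.
source_chunks = np.split(np.arange(10), 5)

# Merge pairs of chunks: resulting chunk sizes are 4, 4 and 2.
merged = merge_adjacent_chunks(source_chunks, factor=2)
merged_sizes = [len(c) for c in merged]
```

A general rechunk, by contrast, may need each output block to read from arbitrary parts of many input blocks.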
Consider a simple blockwise operation with one input, where each task carries out the following steps: 1. read compressed Zarr chunk 2. decompress Zarr chunk to produce the input array...
See https://github.com/tomwhite/cubed/issues/284#issuecomment-1660425647
If you're looking for something to do :), then scans would be a good thing to add. Dask calls this "cumreduction" (terrible name!), and it's quite a useful primitive...
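For readers unfamiliar with scans, a chunked cumulative sum can be sketched in plain NumPy as follows. This is one common way to structure a scan over chunks, under the assumption of a 1-D array with non-empty chunks. The function name `chunked_cumsum` is hypothetical, and this is not how Cubed or Dask implement it:

```python
import numpy as np

def chunked_cumsum(chunks):
    """Cumulative sum over a 1-D array given as a list of non-empty chunks.

    Hypothetical illustration of a scan, not Cubed or Dask code.
    """
    # Step 1: scan each chunk independently (parallelisable per chunk).
    local_scans = [np.cumsum(c) for c in chunks]

    # Step 2: the total of each chunk is the last element of its local scan.
    totals = np.array([s[-1] for s in local_scans])

    # Step 3: an exclusive scan of the totals gives each chunk's offset,
    # i.e. the sum of all elements in the chunks before it.
    offsets = np.concatenate(([0], np.cumsum(totals)[:-1]))

    # Step 4: add each offset to its chunk's local scan (again per chunk).
    return [s + o for s, o in zip(local_scans, offsets)]

chunks = [np.array([1, 2, 3]), np.array([4, 5]), np.array([6])]
scanned = chunked_cumsum(chunks)
```

Only step 3 needs information from more than one chunk, and it operates on one number per chunk rather than on the full data.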
It would be useful to provide numbers for actual resources used after a computation is complete so its cost can be calculated. We probably need: 1. total worker seconds 2....