cubed

Bounded-memory serverless distributed N-dimensional array processing

Results 133 cubed issues

See image for demonstration. ![Screenshot from 2023-03-14 19-26-13](https://user-images.githubusercontent.com/35968931/225164219-0125df14-e3f9-46ee-85c2-8ec523093ec1.png) `np.nanmean` is called by xarray's `.mean()` method when `skipna=True`, which is the default.

Lithops uses an async programming model, but not Python asyncio. It would be nice if we could bridge the two, as then we'd be able to use the common asyncio...

runtime
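
One way the bridging idea in the Lithops entry above might look, sketched with the standard library only. A `ThreadPoolExecutor` stands in for the remote executor; Lithops' own futures are not used here, and whether they can be wrapped the same way is an assumption. The names `run_remote` and `square` are hypothetical.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Hypothetical task standing in for a remote function call.
    return x * x


async def run_remote(executor, func, arg):
    # Submit the call, then wrap the concurrent.futures.Future so it can be awaited.
    future = executor.submit(func, arg)
    return await asyncio.wrap_future(future)


async def main():
    with ThreadPoolExecutor(max_workers=4) as executor:
        # gather() returns results in the order the awaitables were passed in.
        return await asyncio.gather(
            *(run_remote(executor, square, i) for i in range(5))
        )


results = asyncio.run(main())
```

Once the futures are awaitable, the usual asyncio tools (`gather`, `wait_for`, `as_completed`) apply to them without further adaptation.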

It should run in CI too, like the Modal one does.

runtime

Add a GH Actions workflow that runs Cubed tests using Zarr V3 storage. This will ensure that the upcoming work to update Zarr to support the V3 spec (see https://github.com/zarr-developers/zarr-python/discussions/1480)...

zarr

(I wrote this to help track what work needs to be done on the executors, but it might be useful to add to the user docs at some point.) This...

documentation
runtime

#221 introduced `merge_chunks`, a special case of `rechunk` that can be implemented using `blockwise`. I noticed that whilst `reduction` calls `merge_chunks` directly, inside `ops.rechunk` the [primitive rechunk is always called](https://github.com/tomwhite/cubed/blob/93ad984e7b0445164ab11b3c3f3a3b7db6c3bc97/cubed/core/ops.py#L631C11-L631C11). Shouldn't...

core
optimization

Consider a simple blockwise operation with one input, where each task carries out the following steps: 1. read compressed Zarr chunk 2. decompress Zarr chunk to produce the input array...

memory
optimization
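
The first two steps listed in the blockwise entry above can be illustrated with a rough sketch. It uses `zlib` as a stand-in for whatever compressor the Zarr store is configured with, and a hypothetical 1 MB chunk of zero bytes; it covers only the two steps that are visible in the excerpt.

```python
import zlib

# Hypothetical uncompressed chunk: 1 MB of zero bytes standing in for array data.
raw = bytes(1_000_000)

# Step 1: the compressed chunk, as it would arrive from storage.
compressed = zlib.compress(raw)

# Step 2: decompress the chunk to produce the input array's bytes.
decompressed = zlib.decompress(compressed)

# Both buffers are alive at this point, so the task holds at least this many bytes.
bytes_held = len(compressed) + len(decompressed)
```

The point of the sketch is only that the compressed and decompressed copies coexist for a moment, so a task needs more memory than the decompressed chunk size alone.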

See https://github.com/tomwhite/cubed/issues/284#issuecomment-1660425647

bug
optimization

If you're looking for something to do :), then scans would be a good thing to add. Dask calls this "cumreduction" (terrible name!), and it's quite a useful primitive...

array api
core
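
For readers unfamiliar with scans, here is a minimal plain-Python sketch of a cumulative sum computed block by block. It is not Cubed's or Dask's implementation; `blockwise_cumsum` is a hypothetical name, and lists stand in for array chunks.

```python
from itertools import accumulate


def blockwise_cumsum(blocks):
    """Cumulative sum over a sequence of blocks (lists of numbers)."""
    # Step 1: scan each block independently (parallelisable in principle).
    local = [list(accumulate(block)) for block in blocks]
    # Step 2: an exclusive scan of the block totals gives each block's offset.
    totals = [scanned[-1] if scanned else 0 for scanned in local]
    offsets = [0] + list(accumulate(totals))[:-1]
    # Step 3: add each block's offset to its local scan.
    return [[v + off for v in scanned] for scanned, off in zip(local, offsets)]


result = blockwise_cumsum([[1, 2, 3], [4, 5], [6]])
```

Only step 2 needs information from other blocks, and it operates on one value per block, which is why a scan can fit a chunked execution model.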

It would be useful to provide numbers for actual resources used after a computation is complete so its cost can be calculated. We probably need: 1. total worker seconds 2....

enhancement
runtime
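
As an illustration of the first quantity named in the entry above, total worker seconds could be derived from per-task timings along these lines. `TaskRecord` is a hypothetical record type, not a Cubed data structure, and the price per second is an invented figure used only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    # Hypothetical per-task timing record; not a Cubed data structure.
    start: float  # time in seconds when the task began on a worker
    end: float    # time in seconds when the task finished


def total_worker_seconds(records):
    # Sum of per-task wall-clock durations across all workers.
    return sum(r.end - r.start for r in records)


records = [TaskRecord(0.0, 2.5), TaskRecord(1.0, 4.0), TaskRecord(3.0, 3.5)]
worker_seconds = total_worker_seconds(records)

# Hypothetical price per worker second, for illustration only.
cost = worker_seconds * 0.00002
```

Tasks that overlap in time are counted separately, because each one occupies its own worker.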