cubed
Bounded-memory serverless distributed N-dimensional array processing
Part of #675 that I wrote on the way to AGU last year
Implement the [Fourier transform extension](https://data-apis.org/array-api/latest/extensions/fourier_transform_functions.html) from the [Python array API standard](https://data-apis.org/array-api/latest/index.html).
https://github.com/cubed-dev/cubed/blob/5f75ba24544fb3f16ab7b8f477334fa47da1a1b2/cubed/array_api/manipulation_functions.py#L546 A pattern for tracking work that I really enjoy is associating `TODO`s in code with GitHub issues. Even if these are the lowest priority, it's still nice to associate code...
@betolink was asking if there is a way in Dask to keep track of the cumulative network traffic for pulling data, but I think Cubed should have enough information to...
This is the example I was working on at the [Post-AGU Pangeo Hack Day](https://discourse.pangeo.io/t/post-agu-pangeo-hack-day-working-meeting-december-14-2024-in-washington-dc/4440) on Saturday. The idea is to take multiple NetCDF files, combine with VirtualiZarr, and then rechunk...
Maybe useful? Could simply roll back to state before failed stage. Also then it's Icechunk's problem to worry about atomic writes... Idea from: A reason to not run backup...
The `region`/`regions` parameter allows writing into a portion of the target Zarr array(s). Also for Icechunk. From https://github.com/cubed-dev/cubed/pull/633#discussion_r1869533923
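As a rough illustration of what writing into a region means, the sketch below uses a NumPy array as a stand-in for the target Zarr array. `write_region` is a hypothetical helper, not Cubed's or Zarr's API, and passing `region` as a tuple of slices (one per dimension) is an assumption made for the example.

```python
import numpy as np


def write_region(target, data, region):
    """Write `data` into the part of `target` selected by `region`.

    `region` is assumed to be a tuple of slices, one per dimension of
    `target`. This is a hypothetical helper for illustration only.
    """
    if len(region) != target.ndim:
        raise ValueError("region must have one slice per target dimension")
    # Shape that the region selects, computed from each slice and axis length.
    expected = tuple(
        len(range(*s.indices(n))) for s, n in zip(region, target.shape)
    )
    if data.shape != expected:
        raise ValueError(f"data shape {data.shape} != region shape {expected}")
    target[region] = data
    return target


# NumPy stand-in for an existing target array; only the region is overwritten.
target = np.zeros((4, 6), dtype=np.int64)
write_region(target, np.ones((2, 3), dtype=np.int64), (slice(1, 3), slice(2, 5)))
```

Everything outside the region keeps its previous values, which is the property that makes this useful for appending to or patching an existing store.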
Even when all the tests pass, the build doesn't exit immediately and will be cancelled a few hours later (e.g. https://github.com/cubed-dev/cubed/actions/runs/11974180543/job/33384715251?pr=621).
See https://github.com/cubed-dev/cubed/actions/runs/12164228442/job/33925369770

For example (running locally):

```
pytest -vs 'cubed/tests/test_array_api.py::test_astype[dask]'
========================================================== test session starts ===========================================================
platform darwin -- Python 3.10.15, pytest-8.3.4, pluggy-1.5.0 -- /Users/tom/miniforge3/envs/cubed-dask-m1/bin/python3.10
cachedir: .pytest_cache
rootdir: /Users/tom/workspace/cubed
configfile: pyproject.toml...
```
From @dcherian in https://github.com/cubed-dev/cubed/pull/636#discussion_r1869951520

> https://github.com/pydata/xarray/blob/99ee8c6ca54057a9b994d7685f36236f2d5a69d9/xarray/core/indexing.py#L1084 and friends. We rewrite the query to normal slice with +ve stride, then reverse in-memory after read :)
>
> This is one of...
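
The trick described in the quote (read with a positive stride, then reverse in memory) can be sketched roughly as below. This is a minimal one-dimensional illustration on a NumPy array, not the xarray code linked above; `normalize_negative_slice` and `read_with_slice` are hypothetical names.

```python
import numpy as np


def normalize_negative_slice(s, size):
    """Return (positive_step_slice, needs_reverse) selecting the same
    elements as `s` on an axis of length `size`."""
    start, stop, step = s.indices(size)
    if step > 0:
        return slice(start, stop, step), False
    n = len(range(start, stop, step))  # number of selected elements
    if n == 0:
        return slice(0, 0, 1), False
    last = start + (n - 1) * step  # smallest index that `s` selects
    return slice(last, start + 1, -step), True


def read_with_slice(arr, s):
    """Read with a positive stride only, then reverse in memory if needed."""
    pos, needs_reverse = normalize_negative_slice(s, arr.shape[0])
    out = arr[pos]  # the only "read" uses a positive stride
    return out[::-1] if needs_reverse else out


arr = np.arange(10)
result = read_with_slice(arr, slice(8, 1, -3))
```

The storage layer then only ever sees forward slices, which keeps chunk reads simple at the cost of one in-memory reversal.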