Tom Nicholas

182 issues by Tom Nicholas

> We should probably make Cubed store its intermediate data in a directory named `{CONTEXT_ID}/{compute_id}`, but that's a bit more work. _Originally posted by @TomNicholas in https://github.com/cubed-dev/cubed-benchmarks/pull/10#discussion_r1513284448_

I got the first example to run :champagne: `python examples/lithops-add-asarray.py "s3://cubed-$USER-temp" cubed-runtime` But to get it to run on AWS (I don't have a Modal account) I did have to...

documentation

Cubed arguably has enough information to give a rough estimate of the monetary cost of executing the plan before starting execution. I'm imagining a new method `.estimate_cost(executor)` that is similar...

enhancement

Cubed currently always implements the shuffle operation as an all-to-all rechunking using the [algorithm from rechunker](https://rechunker.readthedocs.io/en/latest/algorithm.html). This creates an intermediate persistent Zarr store, and requires all chunks to be written...

help wanted
primitive
optimization

I tried to re-run the quadratic means example with recent improvements to Cubed but got stuck on a Lithops version mismatch error ``` Exception: Lithops version mismatch. Host version 2.9.0...

I expect you're already aware of this, @tomwhite, but I wanted to ask whether you thought the [google-tensorstore project](https://ai.googleblog.com/2022/09/tensorstore-for-high-performance.html) might be useful in Cubed. @rabernat [suggested](https://discourse.pangeo.io/t/google-tensorstore-3d-data-package/2778) benchmarking...

zarr

It would be nice to add `map_overlap` alongside `map_blocks`, `blockwise`, `rechunk`, and `apply_gufunc`. It's currently not directly used within xarray (even within `xarray.map_blocks`, which builds an HLG), but maybe it...

enhancement
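To illustrate what a `map_overlap`-style operation does, here is a minimal sketch in plain NumPy. It is not Cubed's API or implementation: the helper `map_overlap_1d` and the example function `neighbour_sum` are hypothetical names, and the sketch handles only a 1-D array. Each chunk is extended by `depth` elements borrowed from its neighbours, the function is applied, and the borrowed elements are trimmed from the result.

```python
import numpy as np


def map_overlap_1d(func, arr, chunk_size, depth):
    """Apply `func` chunk by chunk, giving each chunk a halo of `depth`
    neighbouring elements on each side, then trim the halo again.

    Hypothetical helper for illustration only; assumes `func` returns an
    array of the same length as its input.
    """
    n = arr.shape[0]
    pieces = []
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        lo = max(start - depth, 0)   # extend left, clipped at the array edge
        hi = min(stop + depth, n)    # extend right, clipped at the array edge
        result = func(arr[lo:hi])
        offset = start - lo          # number of halo elements on the left
        pieces.append(result[offset:offset + (stop - start)])
    return np.concatenate(pieces)


def neighbour_sum(x):
    """Sum of each element and its two immediate neighbours (zero at edges)."""
    padded = np.pad(x, 1)
    return padded[:-2] + padded[1:-1] + padded[2:]


data = np.arange(10.0)
chunked = map_overlap_1d(neighbour_sum, data, chunk_size=4, depth=1)
whole = neighbour_sum(data)
no_halo = map_overlap_1d(neighbour_sum, data, chunk_size=4, depth=0)
```

With a halo of depth 1 the chunked result matches applying the function to the whole array, whereas with depth 0 the values at chunk boundaries differ, which is the case `map_blocks` alone cannot handle.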

There are a few numpy functions which xarray calls on wrapped arrays but which are not (yet) in the Array API Standard. (See https://github.com/data-apis/array-api/issues/187#issuecomment-1553615779) Cubed could choose to implement these...

enhancement
array api
xarray-integration

All intermediate results in Cubed are written out to persistent storage via Zarr, but currently Zarr can't represent arbitrary chunked arrays, because the Zarr spec does not yet support irregular...

enhancement
zarr
upstream
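As a rough illustration of the constraint described above, the sketch below checks whether the chunk sizes along one axis fit a regular grid, meaning every chunk has the same size except that the final one may be smaller. The function name `is_regular_chunking` is hypothetical and is not part of Zarr or Cubed.

```python
def is_regular_chunking(chunk_sizes):
    """Return True if chunk sizes along one axis fit a regular grid.

    Hypothetical helper: all chunks must match the first chunk's size,
    except the final chunk, which may be smaller (but not empty).
    """
    if len(chunk_sizes) == 0:
        return True
    first = chunk_sizes[0]
    *body, last = chunk_sizes
    return all(size == first for size in body) and 0 < last <= first


regular = is_regular_chunking((4, 4, 2))    # uniform, short final chunk
irregular = is_regular_chunking((2, 5, 3))  # sizes vary along the axis
```

A chunking such as `(2, 5, 3)` is the kind that a regular chunk grid cannot express directly.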

See image for demonstration. ![Screenshot from 2023-03-14 19-26-13](https://user-images.githubusercontent.com/35968931/225164219-0125df14-e3f9-46ee-85c2-8ec523093ec1.png) `np.nanmean` is called by xarray's `.mean()` method when `skipna=True`, which is the default.
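For context, the difference between `np.mean` and `np.nanmean` can be seen on a plain NumPy array. This is a minimal sketch using only NumPy, not xarray or Cubed, and it does not reproduce the behaviour shown in the screenshot.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

plain_mean = np.mean(values)    # NaN propagates into the result
nan_mean = np.nanmean(values)   # NaN entries are ignored
```

Here `plain_mean` is NaN while `nan_mean` is the mean of the two remaining values, which is the behaviour `skipna=True` asks for.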