Implement boolean array indexing

Open tomwhite opened this issue 3 years ago • 2 comments

https://data-apis.org/array-api/latest/API_specification/indexing.html#boolean-array-indexing

The result has a data-dependent shape, which makes this difficult to implement: each output chunk can be an arbitrary size, yet Zarr doesn't support non-regular chunk sizes.
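
To make the problem concrete, here is a minimal NumPy sketch (not Cubed code) that applies a boolean mask chunk by chunk. The array, the mask, and `chunk_size` are made up for illustration; in-memory slices stand in for Zarr chunks.

```python
import numpy as np

x = np.arange(12)
mask = x % 3 == 0  # True at positions 0, 3, 6, 9
chunk_size = 4     # regular input chunking (illustrative value)

# Apply the mask independently to each regular input chunk.
out_chunks = [
    x[i:i + chunk_size][mask[i:i + chunk_size]]
    for i in range(0, len(x), chunk_size)
]

# The output chunk sizes depend on the mask values, so they are
# not known ahead of time and are generally not uniform.
out_chunk_sizes = [len(c) for c in out_chunks]
print(out_chunk_sizes)  # [2, 1, 1]
```

The input chunks are regular (4, 4, 4) but the output chunks are (2, 1, 1), which a regularly chunked array cannot represent without a further rechunking step.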

It may be simplest to wait until non-regular chunking is available in Zarr 3.0 before doing this. There's a sketch of how this might be implemented here: https://github.com/zarr-developers/zarr-specs/issues/49#issuecomment-611473418

Using Zarr 3.0 non-regular chunking could be an implementation detail, by ensuring that it is only used for intermediate data, and that the final output is always a Zarr 2.0 array with regular chunking - at least until Zarr 3.0 is more widely supported.
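
As a rough illustration of that last step, the sketch below repacks irregular intermediate chunks into regular ones. It is only an assumption about how this could look: `regularize_chunks` is a hypothetical helper, not part of Cubed or Zarr, and plain NumPy arrays stand in for the intermediate and final Zarr arrays.

```python
import numpy as np


def regularize_chunks(irregular, chunk_size):
    """Repack a list of 1-D chunks of arbitrary size into regular chunks.

    Hypothetical helper for illustration only. Every output chunk has
    length `chunk_size`, except possibly the last one.
    """
    sizes = [len(c) for c in irregular]
    # offsets[i] is the global start index of irregular chunk i.
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    total = int(offsets[-1])

    regular = []
    for start in range(0, total, chunk_size):
        stop = min(start + chunk_size, total)
        pieces = []
        for i, chunk in enumerate(irregular):
            # Overlap between output range [start, stop) and chunk i.
            lo = max(start, offsets[i])
            hi = min(stop, offsets[i + 1])
            if lo < hi:
                pieces.append(chunk[lo - offsets[i]:hi - offsets[i]])
        regular.append(np.concatenate(pieces))
    return regular


intermediate = [
    np.array([0, 3]),
    np.array([6]),
    np.array([9]),
    np.array([], dtype=int),  # a chunk where the mask selected nothing
    np.array([12, 15, 18]),
]
final = regularize_chunks(intermediate, chunk_size=3)
print([len(c) for c in final])  # [3, 3, 1]
```

Note that the offsets require the sizes of all intermediate chunks to be known first, so the repacking can only be planned after the masking step has run.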

tomwhite avatar Aug 02 '22 15:08 tomwhite

Implementing nonzero and the unique_* functions (https://github.com/tomwhite/cubed/blob/main/api_status.md) also needs support for non-regular chunking.
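
A quick NumPy illustration of why these functions fall into the same category: two inputs with the same shape give outputs of different shapes. NumPy's `np.nonzero` and `np.unique` are used here as stand-ins for the array API's `nonzero` and `unique_values`.

```python
import numpy as np

a = np.array([0, 2, 0, 2, 5, 0])
b = np.array([1, 1, 1, 1, 1, 1])

# Same input shape, different output shapes.
nonzero_shapes = (np.nonzero(a)[0].shape, np.nonzero(b)[0].shape)
unique_shapes = (np.unique(a).shape, np.unique(b).shape)

print(nonzero_shapes)  # ((3,), (6,))
print(unique_shapes)   # ((3,), (1,))
```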

tomwhite avatar Aug 02 '22 15:08 tomwhite

It looks like Xarray does not yet have this feature either? https://github.com/pydata/xarray/issues/1887

hammer avatar Sep 29 '22 15:09 hammer