variable-sized chunks with zarr v3
- [ ] Closes #xxxx
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
Building on top of zarr-developers/zarr-python#3534, this is a draft PR that allows writing variable-sized chunks to zarr.
To see this in action, try:
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "xarray @ git+https://github.com/keewis/xarray.git@variable-chunking",
# "zarr @ git+https://github.com/jhamman/zarr-python.git@feature/rectilinear-chunk-grid",
# ]
# ///
import numpy as np
import xarray as xr
rng = np.random.default_rng(seed=0)
values = rng.normal(size=(365, 20))
ds = xr.Dataset(
{"a": (["time", "x"], values)},
coords={"time": xr.date_range("2025-01-01", freq="D", periods=365)},
)
chunked = ds.chunk({"time": xr.groupers.TimeResampler(freq="ME"), "x": 10})
chunked.to_zarr(
"variable_chunks.zarr",
mode="w",
safe_chunks=False,
zarr_format=3,
consolidated=False,
)
# reopen the store that was just written; chunks={} uses the on-disk chunking
ds = xr.open_dataset("variable_chunks.zarr", engine="zarr", chunks={})
print(ds.chunksizes)
# Frozen({'time': (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31), 'x': (10, 10)})
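The `time` chunk sizes in that output are just the number of days in each month of 2025, which is what month-end resampling should produce. As a quick sanity check that is independent of xarray and zarr, the tuple can be reproduced with the standard library. This is a minimal sketch; the helper name `month_chunks` is made up for illustration and is not part of this PR.

```python
import calendar


def month_chunks(year: int) -> tuple[int, ...]:
    """Return the number of days in each calendar month of `year`.

    Hypothetical helper, only used to cross-check the chunk sizes above.
    """
    # calendar.monthrange returns (weekday of first day, number of days)
    return tuple(calendar.monthrange(year, month)[1] for month in range(1, 13))


time_chunks = month_chunks(2025)
print(time_chunks)
# (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
print(sum(time_chunks))
# 365
```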
At the moment, writing variable-sized chunks requires safe_chunks=False because I haven't changed the chunk alignment machinery yet.
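For discussion, here is one way an alignment check for variable-sized chunks might look. It is a sketch under my own assumptions, not what this PR or xarray currently implements: a write is treated as safe along one axis if every in-memory chunk covers only whole stored chunks, i.e. the in-memory chunk boundaries are a subset of the stored chunk boundaries. The names `boundaries` and `is_aligned` are hypothetical.

```python
from itertools import accumulate


def boundaries(chunks):
    """Cumulative end offsets of a chunk tuple, e.g. (31, 28) -> {31, 59}."""
    return set(accumulate(chunks))


def is_aligned(write_chunks, stored_chunks):
    """Return True if each in-memory chunk covers only whole stored chunks.

    Assumed criterion (not taken from xarray): no stored chunk may be
    touched by more than one in-memory chunk along this axis.
    """
    if sum(write_chunks) != sum(stored_chunks):
        raise ValueError("chunk tuples must cover the same axis length")
    return boundaries(write_chunks) <= boundaries(stored_chunks)


months = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
print(is_aligned(months, months))      # identical chunking
print(is_aligned((59, 306), months))   # splits exactly at a month boundary
print(is_aligned((30, 335), months))   # splits inside January
```

With the month-length chunks from the example, the first two calls return True and the last returns False, because a boundary at day 30 would make two in-memory chunks write to the January chunk.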
cc @d-v-b, @jhamman, @dcherian
We need zarr-python>=3, which doesn't work with @jhamman's fork because the fork doesn't have tags for versions above 3.0.0b2.
I just pushed tags to my fork!
Thanks, I've changed the example back to using your fork.