variable-sized chunks with zarr v3

Open keewis opened this issue 2 months ago • 2 comments

  • [ ] Closes #xxxx
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst

Building on top of zarr-developers/zarr-python#3534, this is a draft PR that allows writing variable-sized chunks to zarr.

To see this in action, try:

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "xarray @ git+https://github.com/keewis/xarray.git@variable-chunking",
#   "zarr @ git+https://github.com/jhamman/zarr-python.git@feature/rectilinear-chunk-grid",
# ]
# ///

import numpy as np
import xarray as xr

rng = np.random.default_rng(seed=0)
values = rng.normal(size=(365, 20))

ds = xr.Dataset(
    {"a": (["time", "x"], values)},
    coords={"time": xr.date_range("2025-01-01", freq="D", periods=365)},
)
chunked = ds.chunk({"time": xr.groupers.TimeResampler(freq="ME"), "x": 10})

chunked.to_zarr(
    "variable_chunks.zarr",
    mode="w",
    safe_chunks=False,
    zarr_format=3,
    consolidated=False,
)

# reopen the store written above; chunks={} uses the on-disk chunking
ds = xr.open_dataset("variable_chunks.zarr", engine="zarr", chunks={})
print(ds.chunksizes)
# Frozen({'time': (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31), 'x': (10, 10)})

At the moment, this requires safe_chunks=False because I haven't changed the chunk alignment machinery yet.
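
As a sanity check on the output above: the chunk sizes along time should simply be the calendar month lengths of 2025, since the data is daily and chunked by month end. A minimal stdlib-only sketch that reproduces the expected tuple (the helper name `month_lengths` is made up for illustration and is not part of xarray or zarr):

```python
import calendar


def month_lengths(year: int) -> tuple[int, ...]:
    """Number of days in each month of `year` (hypothetical helper)."""
    # calendar.monthrange returns (weekday of first day, days in month)
    return tuple(calendar.monthrange(year, month)[1] for month in range(1, 13))


expected_time_chunks = month_lengths(2025)
print(expected_time_chunks)
# (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# the chunks must cover the full daily time axis of the example
assert sum(expected_time_chunks) == 365
```

Comparing this tuple against `ds.chunksizes["time"]` after reopening the store is one way to confirm that the variable-sized chunks survived the round trip.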

cc @d-v-b, @jhamman, @dcherian

keewis • Oct 27 '25 16:10

> We need zarr-python>=3, which doesn't work with @jhamman's fork because it doesn't have tags for versions above 3.0.0b2

I just pushed tags to my fork!

jhamman • Oct 27 '25 16:10

thanks, I've changed the example back to using your fork

keewis • Oct 27 '25 16:10