
feat: change array creation signature to allow sharding specification [do not merge]

Open d-v-b opened this issue 5 months ago • 0 comments

The goal of this PR is to demonstrate one strategy to simplify the creation of arrays that use sharding. Don't consider merging this until we get a good look at some alternatives.

This PR alters the Array.create routine, removing the chunk_shape kwarg and instead beefing up the semantics of the chunks kwarg. Specifically, the chunks kwarg supports a new variant, ChunkSpec, which aims to compactly specify both the chunk shape of an array as well as the (optional) sub-chunk shape.

ChunkSpec is a typed dictionary with two keys: read_shape and write_shape. write_shape specifies the shape of array chunks that can be written concurrently, i.e. the shape in array coordinates of the chunk files. read_shape specifies the shape of array chunks that can be read concurrently, i.e. the shape in array coordinates of the sub-chunks contained in a chunk constructed with a sharding codec.
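As a rough illustration of the shape described above, a non-total typed dictionary with these two keys could be declared as follows. This is a sketch based only on the description in this PR, not the definition from the diff itself; the field types (tuples of ints) are an assumption.

```python
from typing import TypedDict


class ChunkSpec(TypedDict, total=False):
    """Sketch of the ChunkSpec described above (field types are assumed)."""

    # Shape, in array coordinates, of the chunks that can be written concurrently.
    write_shape: tuple[int, ...]
    # Shape, in array coordinates, of the chunks that can be read concurrently.
    read_shape: tuple[int, ...]


# Because total=False, every key is optional, so all of these are valid values.
empty: ChunkSpec = {}
unsharded: ChunkSpec = {"write_shape": (20, 20)}
sharded: ChunkSpec = {"write_shape": (20, 20), "read_shape": (10, 10)}
```

Declaring the dictionary with `total=False` is what makes `{}` a well-typed value, which is why the empty case has to be handled at all.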

  • passing chunks = None or chunks = {} (the latter is supported because of how non-total TypedDicts work) to Array.create will automatically specify chunks using the old v2 logic.
  • passing chunks = {'write_shape': (20, 20)} OR chunks = {'read_shape': (20, 20)} to Array.create will configure that array with no sharding and a chunk size of (20,20).
  • passing chunks = {'write_shape': (20, 20), 'read_shape': (10,10)} to Array.create will configure that array with sharding, with a sub-chunk size of (10,10), and a chunk size of (20,20). This will also route all of the user-specified codecs, if any, to the sharding codec.
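The three cases above can be summarized with a small standalone function. This is only a sketch of the decision logic as described in this PR, not the code in the diff: `resolve_chunks` is a hypothetical helper name, the `ChunkSpec` field types are assumed, and the divisibility check is an added assumption rather than something stated here.

```python
from typing import Optional, TypedDict


class ChunkSpec(TypedDict, total=False):
    write_shape: tuple[int, ...]
    read_shape: tuple[int, ...]


Shape = tuple[int, ...]


def resolve_chunks(chunks: Optional[ChunkSpec]) -> Optional[tuple[Shape, Optional[Shape]]]:
    """Hypothetical helper: return (chunk_shape, sub_chunk_shape).

    Returns None when the chunk shape should be chosen automatically.
    sub_chunk_shape is None when no sharding is requested.
    """
    # Case 1: None or {} means "pick chunks automatically".
    if not chunks:
        return None

    write = chunks.get("write_shape")
    read = chunks.get("read_shape")

    # Case 2: only one of the two keys is given, so there is no sharding
    # and the single shape is used as the chunk shape.
    if write is None:
        return read, None
    if read is None:
        return write, None

    # Case 3: both keys are given, so the array is sharded.
    # Assumption (not stated in the PR text): each chunk dimension must be
    # a whole multiple of the corresponding sub-chunk dimension.
    if len(write) != len(read):
        raise ValueError("write_shape and read_shape must have the same length")
    if any(w % r != 0 for w, r in zip(write, read)):
        raise ValueError("write_shape must be divisible by read_shape")
    return write, read


print(resolve_chunks({"write_shape": (20, 20), "read_shape": (10, 10)}))
```

In the sharded case, the caller would then wrap any user-specified codecs inside the sharding codec, as the last bullet describes.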

Note that this PR does not change the signature of the array class itself. That would be a separate effort.

addresses #2170

TODO:

  • [ ] Add unit tests and/or doctests in docstrings
  • [ ] Add docstrings and API docs for any new/modified user-facing classes and functions
  • [ ] New/modified features documented in docs/tutorial.rst
  • [ ] Changes documented in docs/release.rst
  • [ ] GitHub Actions have all passed
  • [ ] Test coverage is 100% (Codecov passes)

d-v-b, Sep 10 '24 19:09