zarr-python
Feat/write empty chunks
This PR adds a boolean `array.write_empty_chunks` value to the global config, and uses this value to control whether chunks that are "empty", i.e. filled with values equivalent to the array's fill value, are written to storage.
In `zarr-python` 2.x, `write_empty_chunks` was a property of an `Array` that users specified when creating the `Array` object. This had pros and cons, which I'm happy to discuss if people are interested, but the tl;dr is that the cons of that approach are driving my decision in this PR to make `write_empty_chunks` a global runtime property accessible via the config API.
Usage looks something like this (`donfig` experts, please correct me if there's a better way):
with config.set({"array.write_empty_chunks": write_empty_chunks}):
    arr[:] = fill_value
If people hate this, then we can definitely change this API. I'm very open to discussion here.
Also worth noting:
Our check for whether a chunk is equal to the fill value is pretty inefficient: it allocates a new array on every invocation. This can definitely be made more efficient, either in a stupid way, by caching an all-fill-value chunk on the array instance and using that for the comparison, or in a smarter way, by doing the `(chunk, fill_value)` comparison without allocating a new array. But I think this is an effort for a separate PR.
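For reference, the options above could look roughly like this in NumPy. These are sketches under my own assumptions, not the code in this PR: `is_empty_naive`, `CachedEmptyCheck`, and `is_empty_scalar` are hypothetical names, and none of them handle NaN fill values (`NaN != NaN`, so those would need special-casing).

```python
import numpy as np


def is_empty_naive(chunk, fill_value):
    """Allocate a fresh all-fill-value array on every call, then compare."""
    reference = np.full(chunk.shape, fill_value, dtype=chunk.dtype)
    return bool(np.array_equal(chunk, reference))


class CachedEmptyCheck:
    """Hypothetical helper that builds the all-fill-value chunk only once."""

    def __init__(self, chunk_shape, dtype, fill_value):
        self._reference = np.full(chunk_shape, fill_value, dtype=dtype)

    def __call__(self, chunk):
        # Only chunks with the cached shape can match the reference.
        if chunk.shape != self._reference.shape:
            return False
        return bool(np.array_equal(chunk, self._reference))


def is_empty_scalar(chunk, fill_value):
    """Compare against the scalar directly, without building a reference chunk."""
    # Note: `chunk == fill_value` still creates a temporary boolean array,
    # so this avoids the fill-value array but is not allocation-free.
    return bool(np.all(chunk == fill_value))
```

The cached variant trades memory (one extra chunk per array) for fewer allocations, while the scalar variant avoids the reference chunk entirely but still pays for a boolean temporary.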
closes #2409
TODO:
- [ ] Add unit tests and/or doctests in docstrings
- [ ] Add docstrings and API docs for any new/modified user-facing classes and functions
- [ ] New/modified features documented in docs/tutorial.rst
- [ ] Changes documented in docs/release.rst
- [ ] GitHub Actions have all passed
- [ ] Test coverage is 100% (Codecov passes)