zarr-python
[v3] revisit runtime config
This issue tracks an evaluation of the v3 runtime config.
Context
The v3 branch runtime config currently looks like this:
https://github.com/zarr-developers/zarr-python/blob/76c345071db950b2362f7588ad20da4a1af03b85/src/zarr/v3/config.py#L34-L38
This is then attached to the Array/Group classes:
https://github.com/zarr-developers/zarr-python/blob/76c345071db950b2362f7588ad20da4a1af03b85/src/zarr/v3/array.py#L51-L55
A few things are missing here:
- User experience
  - As a user, I may want to set config settings once and forget about them (e.g. order, concurrency)
- Portability
  - I don't know for sure, but I really doubt that putting the asyncio event loop on the Array class is going to work when it comes to serialization
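One quick way to check the serialization concern is to try pickling an object that holds a live event loop. The sketch below is only an illustration under stated assumptions: `types.SimpleNamespace` stands in for an array object, and the attribute names are hypothetical rather than taken from the v3 branch.

```python
import asyncio
import pickle
import types


def can_pickle(obj) -> bool:
    """Return True if pickle.dumps(obj) succeeds, False if it raises."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


# Hypothetical stand-in for an array that stores a live event loop.
loop = asyncio.new_event_loop()
try:
    array_with_loop = types.SimpleNamespace(shape=(10,), loop=loop)
    loop_ok = can_pickle(array_with_loop)
finally:
    loop.close()

# Hypothetical stand-in that stores only plain config values.
array_with_plain_config = types.SimpleNamespace(
    shape=(10,), config={"order": "C", "concurrency": 10}
)
plain_ok = can_pickle(array_with_plain_config)

print(f"with event loop: {loop_ok}, with plain config: {plain_ok}")
```

If the first check fails while the second passes, that would suggest keeping only plain values on the object and obtaining the loop at call time.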
Improvements
So I'm looking for ideas on how to manage this better. Two ideas:
- Xarray style set_options: https://docs.xarray.dev/en/stable/generated/xarray.set_options.html
  - Pros: allows for validation and is typed
  - Cons: a bit bespoke, doesn't support environment variables or a config file option
- Dask style config: https://donfig.readthedocs.io/en/latest/
  - Pros: very flexible framework, support for environment variables and config files, nested namespaces, etc.
  - Cons: extra dependency (though we could vendor it), no typing or validation
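To make the first option concrete, here is a minimal sketch of a `set_options`-style helper that validates values and works both as a global setter and as a context manager. It is not xarray's implementation; the option names and validators are hypothetical placeholders.

```python
# Hypothetical option names and defaults; not zarr-python's actual settings.
OPTIONS = {"order": "C", "concurrency": 10}

# One validator per option; each returns True for an acceptable value.
_VALIDATORS = {
    "order": lambda v: v in ("C", "F"),
    "concurrency": lambda v: isinstance(v, int) and v > 0,
}


class set_options:
    """Set options globally, or temporarily when used as a context manager."""

    def __init__(self, **kwargs):
        # Validate everything first so a bad value leaves OPTIONS untouched.
        for key, value in kwargs.items():
            if key not in OPTIONS:
                raise ValueError(f"unknown option: {key!r}")
            if not _VALIDATORS[key](value):
                raise ValueError(f"invalid value for {key!r}: {value!r}")
        # Remember the previous values so __exit__ can restore them.
        self._old = {key: OPTIONS[key] for key in kwargs}
        OPTIONS.update(kwargs)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        OPTIONS.update(self._old)
```

Calling `set_options(order="F")` on its own changes the value for the rest of the session ("set and forget"), while `with set_options(concurrency=4): ...` restores the previous value on exit. As noted in the cons, nothing here reads environment variables or config files.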
What do we expect to go in the runtime config?
- Order
- Concurrency
- Logging settings
- What else?
I spoke with @maxrjones today about this. Our thought for now was to try using donfig and see how it goes. We can continue to evaluate the dependency vs vendoring and typing/validation as needed.
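For readers unfamiliar with that style of config, the behaviors that make it attractive (nested namespaces plus environment variable overrides) can be sketched with the standard library alone. This is not donfig's API; the `ZARR_` prefix, the key names, and the double-underscore nesting convention (borrowed from how Dask's configuration names its environment variables) are assumptions made for illustration.

```python
import ast
import os

# Hypothetical defaults; the key names are illustrative only.
DEFAULTS = {"array": {"order": "C"}, "async": {"concurrency": 10}}


def load_config(environ=None, prefix="ZARR_"):
    """Merge DEFAULTS with PREFIX_SECTION__KEY style environment variables."""
    environ = os.environ if environ is None else environ
    # Copy each section so the module-level defaults are never mutated.
    config = {section: dict(values) for section, values in DEFAULTS.items()}
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        # ZARR_ASYNC__CONCURRENCY -> path ["async"], key "concurrency"
        *path, key = name[len(prefix):].lower().split("__")
        try:
            value = ast.literal_eval(raw)  # "4" -> 4
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings such as "F"
        node = config
        for part in path:
            node = node.setdefault(part, {})
        node[key] = value
    return config


def get(config, dotted_key):
    """Look up a value with a dotted key such as 'array.order'."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node
```

With this sketch, exporting `ZARR_ASYNC__CONCURRENCY=4` before starting Python would make `get(load_config(), "async.concurrency")` return the integer 4, without any typing or validation of the value, which matches the trade-off listed above.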
cc @djhoese
Additional config options:
- Specify alternate implementations for CodecPipeline (e.g. for a Rust-based codec pipeline)
- Specify alternate implementations for codecs (e.g. for GPU-based batch-aware codecs)
- Batch size in the HybridCodecPipeline
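One common way to let a config select an alternate implementation is to store a dotted import path and resolve it at runtime. The sketch below assumes that approach; the config keys are hypothetical, and a standard-library class stands in for a real pipeline class so the example runs on its own.

```python
import importlib


def resolve(dotted_path: str):
    """Import 'package.module.Name' and return the object it names."""
    module_name, _, attr = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.Name', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Hypothetical config keys; collections.OrderedDict is only a placeholder
# for whatever pipeline class a user would point at.
config = {
    "codec_pipeline.path": "collections.OrderedDict",
    "codec_pipeline.batch_size": 1,
}

pipeline_cls = resolve(config["codec_pipeline.path"])
```

Because the value is a plain string, it could also be supplied through an environment variable or a config file, and the batch size sits next to it as an ordinary integer setting.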
Thanks @normanrz! Joe mentioned you asked about this today. I'm working on getting a minimal PR opened now and should have it submitted within the next couple of hours.
@normanrz - #1855 is now in the v3 branch. That should clear the way to add additional config options as needed.
Thanks @maxrjones for getting this moving!
I think this is a great way of dealing with configurations. Thanks!