Future of open...() functions

Open alimanfoo opened this issue 6 years ago • 1 comment

Currently there are convenience functions open_array(...) and open_group(...), as well as a multi-purpose open(...).

These functions are a bit of a hangover from the earlier days of Zarr, when there were only two storage options: in memory (dict) or on disk (DirectoryStore). The open...() functions were meant as a convenience for users who wanted to store data on disk and who, coming from h5py or bcolz, were already familiar with file-mode semantics for whether data should be read-only, overwritten, etc. E.g., zarr.open_group(...) is analogous to h5py.File(...).

E.g., the following code for creating a new on-disk array:

import zarr
z = zarr.open_array('/path/to/array.zarr', mode='w', shape=100, dtype='f8')

...is just meant as syntactic sugar for:

store = zarr.DirectoryStore('/path/to/array.zarr')
z = zarr.create(store=store, shape=100, dtype='f8', overwrite=True)

These days, with more storage options available, it may be better to recommend the second, longer form: it shows the general pattern of instantiating a store and then creating an array, which makes it more obvious how to adapt the code to a different storage class.

Also the open...() functions can be a little confusing, e.g. #100.

So I think I'm proposing to modify the tutorial so that the open...() functions are not shown any more, and all examples use the slightly longer syntax where a store is explicitly created first.

I'm not proposing to deprecate the open...() functions, at least not at this stage, as they may still be useful to some users.

alimanfoo, May 24 '18 12:05

xref: #1598, cc @aldenks

jhamman, Feb 03 '24 03:02