consider widening the accepted JSON inputs for fill values

Open d-v-b opened this issue 5 months ago • 5 comments

we are currently pretty strict about the type of fill values -- for an int data type, the JSON string "0" is not a valid fill value, because it's a string, not an int. Mixing up ints and stringified ints is a pretty common mistake, so we should consider allowing anything int-like (e.g. "0" or 0.0) to be a fill value for an int data type.
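To make "int-like" concrete, here is a minimal sketch of what a lenient parser could look like. The function name `parse_int_fill_value` and the exact set of accepted inputs are assumptions for illustration, not the zarr-python API or a settled proposal.

```python
def parse_int_fill_value(data: object) -> int:
    """Hypothetical lenient parser for the JSON form of an integer fill value."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(data, bool):
        raise TypeError(f"Invalid type: {data!r}. Expected an integer-like value.")
    if isinstance(data, int):
        return data
    # Accept floats only when they hold an exact integer value (3.0, not 3.5 or nan).
    if isinstance(data, float) and data.is_integer():
        return int(data)
    # Accept stringified ints such as "0" or "-12".
    if isinstance(data, str):
        try:
            return int(data)
        except ValueError:
            pass
    raise TypeError(f"Invalid type: {data!r}. Expected an integer-like value.")


print(parse_int_fill_value("0"))   # stringified int
print(parse_int_fill_value(3.0))   # integral float
```

Note that Python's `int()` also accepts surrounding whitespace and underscore separators (e.g. `"1_000"`), so a real implementation would need to decide whether that is too permissive.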

a caveat of doing this is that metadata documents cannot be guaranteed to round-trip through zarr-python, because an array with an integer dtype and a fill value of "0" would be resaved with the fill value 0. if that's really important, we could consider a "strict" mode that does guarantee round-tripping.
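The round-trip caveat can be illustrated with plain `json`, without involving zarr at all. This is only a sketch: `resave_metadata` and its `strict` flag are hypothetical names, and the metadata document is trimmed to two keys. It assumes a strict mode would guarantee round-tripping by rejecting non-canonical input rather than preserving it.

```python
import json


def resave_metadata(doc_text: str, strict: bool = False) -> str:
    """Hypothetical load-then-save of a metadata document with an int dtype."""
    meta = json.loads(doc_text)
    fill = meta["fill_value"]
    if isinstance(fill, str):
        if strict:
            # Strict mode: refuse the document instead of silently rewriting it.
            raise TypeError(f"Invalid type: {fill!r}. Expected an integer.")
        # Lenient mode: coerce the stringified int to a real int.
        meta["fill_value"] = int(fill)
    return json.dumps(meta)


original = '{"dtype": "<i4", "fill_value": "0"}'
resaved = resave_metadata(original)
print(resaved)              # the fill value is now the JSON number 0
print(resaved == original)  # False: the document did not round-trip
```

In lenient mode the document is readable but is rewritten on save; in strict mode every document that loads is already in canonical form, so it saves back unchanged.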

see this zulip thread (#NGFF > "N5 library produces zarr that is not readable in python") for context

d-v-b avatar Aug 07 '25 11:08 d-v-b

This will be necessary when zarr3's list of supported data types expands to include non-numeric data types (e.g. string, which zarr v2 supports)

bogovicj avatar Aug 07 '25 13:08 bogovicj

those data types are already supported as extension data types, so I don't think there's a push to put them in the core v3 spec

(edit: I'm assuming by "zarr3" you mean the spec. zarr-python has supported variable-length strings for zarr v3 data for a long time, and as of 3.1 it supports fixed-length strings as well)

d-v-b avatar Aug 07 '25 14:08 d-v-b

in the case of strings, we require the JSON form of the fill value to be a string, but this also might be too strict

d-v-b avatar Aug 07 '25 14:08 d-v-b

ok it looks like our check for fixed-length strings is strict, but for variable-length strings we accept all kinds of stuff:

https://github.com/zarr-developers/zarr-python/blob/926a52fa11845a142f65b10f13c1d9a92c754e6b/src/zarr/core/dtype/npy/string.py#L407-L424

d-v-b avatar Aug 07 '25 14:08 d-v-b

A quick reproducer that I think is related to this:

uvx --from ome-zarr ome_zarr download https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.1/6001239.zarr

gives:

WARNING:ome_zarr.io:version mismatch: detected: FormatV01, requested: FormatV05
Traceback (most recent call last):
  File "/Users/ian/.local/share/uv/tools/ome-zarr/bin/ome_zarr", line 10, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/cli.py", line 210, in main
    ns.func(ns)
    ~~~~~~~^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/cli.py", line 50, in download
    zarr_download(args.path, args.output)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/utils.py", line 322, in download
    for node in reader():
                ~~~~~~^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/reader.py", line 574, in __call__
    node = Node(self.zarr, self)
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/reader.py", line 53, in __init__
    self.specs.append(Multiscales(self))
                      ~~~~~~~~~~~^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/reader.py", line 297, in __init__
    data: da.core.Array = self.array(resolution)
                          ~~~~~~~~~~^^^^^^^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/reader.py", line 320, in array
    return self.zarr.load(resolution)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/ome_zarr/io.py", line 145, in load
    return da.from_zarr(self.__store, subpath)
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/dask/array/core.py", line 3792, in from_zarr
    z = zarr.open_array(store=url, path=component, **kwargs)
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/api/synchronous.py", line 1355, in open_array
    sync(
    ~~~~^
        async_api.open_array(
        ^^^^^^^^^^^^^^^^^^^^^
    ...<6 lines>...
        )
        ^
    )
    ^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/sync.py", line 163, in sync
    raise return_result
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/sync.py", line 119, in _runner
    return await coro
           ^^^^^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/api/asynchronous.py", line 1281, in open_array
    return await AsyncArray.open(store_path, zarr_format=zarr_format)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/array.py", line 991, in open
    return cls(store_path=store_path, metadata=_metadata_dict)
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/array.py", line 333, in __init__
    metadata_parsed = parse_array_metadata(metadata)
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/array.py", line 193, in parse_array_metadata
    return ArrayV2Metadata.from_dict(data)
           ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/metadata/v2.py", line 169, in from_dict
    fill_value = dtype.from_json_scalar(fill_value_encoded, zarr_format=2)
  File "/Users/ian/.local/share/uv/tools/ome-zarr/lib/python3.13/site-packages/zarr/core/dtype/npy/int.py", line 212, in from_json_scalar
    raise TypeError(f"Invalid type: {data}. Expected an integer.")
TypeError: Invalid type: 0. Expected an integer.

ianhi avatar Sep 23 '25 18:09 ianhi