v2.metadata and v3.metadata encode `fill_value` bytes differently
Here I am creating an array in each format and specifying the `fill_value` as the raw bytes `b'X'`:
import zarr
fv = b'X'
a = zarr.create(shape=10, dtype=bytes, zarr_version=2, fill_value=fv)
ad = a.metadata.to_dict()
print(ad)
# -> {'shape': (10,), 'fill_value': 'WA==', 'attributes': {}, 'zarr_format': 2, 'order': 'C', 'filters': None, 'dimension_separator': '.', 'compressor': None, 'chunks': (10,), 'dtype': '|S0'}
b = zarr.create(shape=10, dtype=bytes, zarr_version=3, fill_value=fv)
bd = b.metadata.to_dict()
print(bd)
# -> {'shape': (10,), 'fill_value': (88,), 'chunk_grid': {'name': 'regular', 'configuration': {'chunk_shape': (10,)}}, 'attributes': {}, 'zarr_format': 3, 'data_type': <DataType.bytes: 'bytes'>, 'chunk_key_encoding': {'name': 'default', 'configuration': {'separator': '/'}}, 'codecs': ({'name': 'vlen-bytes', 'configuration': {}},), 'node_type': 'array', 'storage_transformers': ()}
assert zarr.core.metadata.v2.ArrayV2Metadata.from_dict(ad).fill_value == fv
assert zarr.core.metadata.v3.ArrayV3Metadata.from_dict(bd).fill_value == fv
As we can see, the fill value is encoded quite differently in the two formats: v2 stores the string `'WA=='`, while v3 stores the tuple `(88,)`. Remarkably, both are translated back to the original bytes on load.
In both cases, the bytes are going through this path: https://github.com/zarr-developers/zarr-python/blob/aa46b451ae6a83e1befc2525ec9629953949aa79/src/zarr/abc/metadata.py#L33-L34
This converts the bytes to a tuple of ints.
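For illustration, here is a minimal sketch of that conversion in plain Python, independent of zarr. It only shows the built-in behaviour of `bytes`, not the linked zarr code itself; the variable names are mine.

```python
# Iterating over a bytes object yields one int per byte,
# so tuple() turns b'X' into a tuple of ints.
fv = b"X"
as_ints = tuple(fv)
print(as_ints)  # -> (88,)

# The conversion is lossless: bytes() accepts an iterable of ints in [0, 255].
restored = bytes(as_ints)
assert restored == fv
```

This matches the `(88,)` seen in the v3 metadata dict above.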
However, for v2, #2286 added this additional special handling for fill_value:
https://github.com/zarr-developers/zarr-python/blob/aa46b451ae6a83e1befc2525ec9629953949aa79/src/zarr/core/metadata/v2.py#L146-L150
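The `'WA=='` in the v2 output is consistent with a base64 encoding of the fill value. The sketch below reproduces that string with the standard library only; I am assuming this is equivalent to what the linked v2 code does, and have not copied it from there.

```python
import base64

fv = b"X"

# Base64-encode the raw bytes and decode to str so the result is JSON-friendly.
encoded = base64.standard_b64encode(fv).decode("ascii")
print(encoded)  # -> WA==

# Decoding recovers the original bytes.
decoded = base64.standard_b64decode(encoded)
assert decoded == fv
```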
According to the V3 spec:
> Raw data types (`r<N>`): An array of integers, with length equal to `<N>`, where each integer is in the range [0, 255].
This seems in line with what happens in the v3 case, where the fill value is stored as `(88,)`.
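To make the comparison with the quoted spec text concrete, here is a rough check written as a hypothetical helper (`is_valid_raw_fill_value` is not part of zarr). It takes the expected length as a plain argument and only tests the two conditions in the quote: the length, and each integer being in [0, 255].

```python
def is_valid_raw_fill_value(value, length):
    """Hypothetical helper: check a fill value against the quoted spec text."""
    return (
        isinstance(value, (list, tuple))
        and len(value) == length
        and all(
            # bool is a subclass of int in Python, so exclude it explicitly
            isinstance(v, int) and not isinstance(v, bool) and 0 <= v <= 255
            for v in value
        )
    )


print(is_valid_raw_fill_value((88,), 1))   # -> True  (the v3 encoding above)
print(is_valid_raw_fill_value("WA==", 1))  # -> False (the v2 encoding above)
```

Under this reading, the v3 encoding has the shape the spec describes, while the v2 base64 string does not.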
This is relevant to https://github.com/pydata/xarray/issues/5475