Zarr over https returns an empty string in `keys()`

Open b8raoult opened this issue 2 months ago • 6 comments

Zarr version

3.1.3

Numcodecs version

0.15.1

Python Version

3.12.9

Operating System

Mac

Installation

uv pip install zarr

Description

The following code:

import zarr, numcodecs

z = zarr.open('https://data.ecmwf.int/anemoi-datasets/era5-o96-1979-2023-6h-v8.zarr', mode='r')
print(list(z.keys()))

will show that one of the keys is the empty string ''. This is likely because the directory listing links to itself.

See https://data.ecmwf.int/anemoi-datasets/era5-o96-1979-2023-6h-v8.zarr

Steps to reproduce

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues

import zarr
# your reproducer code
# zarr.print_debug_info()

Additional output

No response

b8raoult avatar Nov 09 '25 12:11 b8raoult

directory listing over HTTP is not, to my knowledge, standardized, so we are relying on fsspec's heuristics, which iirc involve assuming that a request against "path/" follows the static file server convention of returning a list of links to path/a, path/b, etc. I'm not sure whether this is something zarr or fsspec should fix. On our side, we could develop a dedicated storage backend for HTTP that deals with this.
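To make the suspected failure mode concrete, here is a minimal sketch of a link-scraping heuristic, using only the Python standard library. It is not fsspec's actual implementation. The HTML snippet, the `example.com` URL, and the names `LinkCollector` and `child_names` are all hypothetical. It only shows how a listing page that links back to itself can yield an empty name:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def child_names(base_url, html):
    """Return link targets under base_url, relative to it (naive heuristic)."""
    if not base_url.endswith("/"):
        base_url += "/"
    parser = LinkCollector()
    parser.feed(html)
    names = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)
        if absolute.startswith(base_url):
            # A link to the directory itself leaves nothing after the prefix.
            names.append(absolute[len(base_url):].rstrip("/"))
    return names


# Hypothetical listing page: the first link points back at the directory itself.
page = '<a href="./">.</a> <a href="data/">data</a> <a href="zarr.json">zarr.json</a>'
names = child_names("https://example.com/store.zarr", page)
print(names)  # ['', 'data', 'zarr.json']
```

The self-link resolves to the base URL, so its relative name is the empty string, which matches the symptom reported above.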

d-v-b avatar Nov 09 '25 12:11 d-v-b

I understand that. I just wanted to report it. I'll exclude the empty string in my code.

b8raoult avatar Nov 09 '25 12:11 b8raoult

we should definitely not be conveying the existence of keys that do not exist, so as a short-term fix we should probably sanitize the results of directory listing
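As a rough illustration of what sanitizing a listing could look like, the sketch below drops empty and duplicate names from a list of entries. `sanitize_listing` is a hypothetical helper, not part of zarr or fsspec, and the input list is made up for the example:

```python
def sanitize_listing(names):
    """Drop empty names and duplicates from a listing, preserving order."""
    seen = set()
    cleaned = []
    for name in names:
        # Treat "data/" and "data" as the same entry.
        name = name.strip("/")
        if not name or name in seen:
            continue
        seen.add(name)
        cleaned.append(name)
    return cleaned


# Hypothetical raw listing containing a self-reference and a duplicate.
raw = ["", "data", "zarr.json", "data/"]
cleaned = sanitize_listing(raw)
print(cleaned)  # ['data', 'zarr.json']
```

On the user side, the equivalent workaround for the reproducer above would be `[k for k in z.keys() if k]`.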

d-v-b avatar Nov 09 '25 12:11 d-v-b

Please note that zarr 2.18.7 does not return that empty string.

b8raoult avatar Nov 09 '25 13:11 b8raoult

But zarr 2 did return `.zgroup`. I expect `keys()` to return all arrays and sub-groups.

b8raoult avatar Nov 09 '25 14:11 b8raoult

one step forward, one step back

d-v-b avatar Nov 09 '25 14:11 d-v-b