zarr-python
Failure to list keys on remote HTTP store
Zarr version
main
Numcodecs version
0.16.3
Python Version
3.13
Operating System
mac
Installation
pep-723
Description
Opening a remote zarr store over HTTPS can silently fail to list keys: keys() returns an empty list even though direct access to members works.
Steps to reproduce
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# "fsspec",
# "requests",
# "aiohttp",
# "ome-zarr"
# ]
# ///
"""
Minimal reproducer for Zarr remote vs local group listing issue.
This demonstrates that zarr.Group.keys() returns empty list for remote
stores (FsspecStore over HTTP) but works correctly for local stores,
even though direct access (group['0']) works in both cases.
"""
from pathlib import Path
import zarr
url = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr"
download_dir = Path("data")
local_path = download_dir / "6001240_labels.zarr"
if not local_path.exists():
    import ome_zarr.utils
    print(f"Downloading {url}...")
    download_dir.mkdir(parents=True, exist_ok=True)
    ome_zarr.utils.download(url, str(download_dir))
    print("Download complete!")
print("\n\n\n")
# actual reproducer
zarr.print_debug_info()
# Test Remote
print("\n1. REMOTE store (FsspecStore over HTTP)")
remote_group = zarr.open_group(url, mode="r")
remote_keys = list(remote_group.keys())
print(f" keys() → {remote_keys}")
print(f" Direct access group['0'] → {type(remote_group['0']).__name__}")
# Test Local
print("\n2. LOCAL store (LocalStore)")
print(f" Path: {local_path}")
local_group = zarr.open_group(str(local_path), mode="r")
local_keys = list(local_group.keys())
print(f" keys() → {local_keys}")
print(f" Direct access group['0'] → {type(local_group['0']).__name__}")
# Show the bug
print("\n" + "=" * 80)
print("BUG: Remote keys() returns empty but direct access works!")
print("=" * 80)
print(f"Remote keys(): {remote_keys} (WRONG - should match local)")
print(f"Local keys(): {local_keys}")
print(f"\nBoth can access group['0']: ✓")
print("\nThis breaks xarray when iterating groups for DataTree.")
Additional output
1. REMOTE store (FsspecStore over HTTP)
keys() → []
Direct access group['0'] → Array
2. LOCAL store (LocalStore)
Path: data/6001240_labels.zarr
keys() → ['0', '1', 'labels', '2']
Direct access group['0'] → Array
================================================================================
BUG: Remote keys() returns empty but direct access works!
================================================================================
Remote keys(): [] (WRONG - should match local)
Local keys(): ['0', '1', 'labels', '2']
Both can access group['0']: ✓
This breaks xarray when iterating groups for DataTree.
I think a big part of the issue here is that the URL points to an S3 store, not a plain HTTPS store, but it's being treated as an HTTP store by url_to_fs from fsspec.
Two action items, I think:
- if we end up in this state, zarr should fail properly instead of just returning nothing
- better detection of this as S3?
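As a rough illustration of the second item, here is a sketch of a user-side workaround that rewrites the HTTPS URL into an s3:// URL plus fsspec options. The helper name http_to_s3 is hypothetical, and it assumes path-style addressing (first path segment is the bucket) and anonymous read access, neither of which is confirmed in this thread.

```python
from urllib.parse import urlsplit


def http_to_s3(url: str) -> tuple[str, dict]:
    """Hypothetical helper: split a path-style S3-over-HTTPS URL into an
    s3:// URL and fsspec storage options.

    Assumes the first path segment is the bucket and that the bucket
    allows anonymous reads.
    """
    parts = urlsplit(url)
    # "/idr/zarr/..." -> bucket "idr", key "zarr/..."
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError(f"no bucket in URL path: {url!r}")
    s3_url = f"s3://{bucket}/{key}" if key else f"s3://{bucket}"
    options = {
        "anon": True,
        "endpoint_url": f"{parts.scheme}://{parts.netloc}",
    }
    return s3_url, options


s3_url, options = http_to_s3(
    "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr"
)
print(s3_url)
print(options)
```

With s3fs installed, the result could presumably be passed along as zarr.open_group(s3_url, mode="r", storage_options=options), which should give a store that can actually list keys.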
In general, HTTP-backed storage is not guaranteed to support directory listing. For OME-Zarr this matters less, because the multiscales attribute of an OME-Zarr multiscale group contains the names of all the scale levels, so you don't actually need directory listing to traverse the zarr nodes relevant to the format.
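A minimal sketch of that traversal, reading the scale-level paths from the group attributes instead of calling keys(). The helper name multiscale_paths is made up for illustration, and example_attrs is a hand-written stand-in, not the actual metadata of the store in this issue. It assumes OME-Zarr 0.5 nests its metadata under an "ome" key while earlier versions keep "multiscales" at the top level.

```python
def multiscale_paths(attrs: dict) -> list[str]:
    """Hypothetical helper: collect dataset paths from OME-Zarr multiscales metadata."""
    # OME-Zarr 0.5 nests metadata under "ome"; earlier versions use the top level.
    meta = attrs.get("ome", attrs)
    paths = []
    for multiscale in meta.get("multiscales", []):
        for dataset in multiscale.get("datasets", []):
            paths.append(dataset["path"])
    return paths


# Hand-written example attributes (illustrative only).
example_attrs = {
    "ome": {
        "version": "0.5",
        "multiscales": [
            {"datasets": [{"path": "0"}, {"path": "1"}, {"path": "2"}]}
        ],
    }
}

print(multiscale_paths(example_attrs))
```

Against the reproducer above, something like multiscale_paths(dict(remote_group.attrs)) followed by remote_group[path] for each path should reach the arrays without any listing support from the store.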