
Consolidated Zarr support could improve S3 data loading

Open mannreis opened this issue 1 year ago • 6 comments

Hello 👋

We've noticed a large difference between reading a remote Zarr dataset [https://...#mode=s3,zarr] and a local one [file://....#mode=file,zarr]:

$ time ncdump/ncdump -v tas file://${HOME}/${DATASET}/#mode=zarr | tail -n+2 | md5sum
abd28bc55fb9d0c25a3767a43d27110a -
real 0m0.111s
user 0m0.104s
sys 0m0.017s
$ time ncdump/ncdump -v tas https://${ENDPOINT}/${BUCKET}/${DATASET}/#mode=zarr,s3 | tail -n+2 | md5sum
abd28bc55fb9d0c25a3767a43d27110a -
real 0m9.854s
user 0m4.739s
sys 0m0.162s

Network overhead is expected, especially if the service imposes rate limits. But such a difference (roughly 90x here) motivated me to look at the implementation behaviour.

It seems that the approach used by netcdf is similar to the one used by Python Zarr: fetching all the metadata in advance. For this reason the following requests are sent (by netcdf) for the example above:

  • 4 GET requests to list the dataset metadata files
  • 674 HEAD requests, mainly to fetch the size of the object to be transferred
  • 224 GET requests to actually read the content of both metadata (223) and data/chunk (1) objects.

There are 3x more HEAD requests than GET requests, so trimming them would only be a small improvement; overall this is not much different from what Python does:

import os
import s3fs
import zarr

# Anonymous S3 filesystem pointing at the same endpoint used above.
s3 = s3fs.S3FileSystem(endpoint_url=f'https://{os.environ["ENDPOINT"]}/', anon=True)
store = s3fs.S3Map(root=f'{os.environ["BUCKET"]}/{os.environ["DATASET"]}', s3=s3)
d = zarr.open(store, 'r')  # walks the hierarchy, touching every .z* key
print(d.info)

Which produces:

  • 127 GET requests to list metadata files or variable names after the dataset prefix
  • 349 HEAD requests to check for metadata files
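
For reference, counts like these can be reproduced without a packet trace by wrapping the store in a counting mapping. A minimal sketch follows; the CountingStore class is mine, not part of zarr, and the mapping of __getitem__ to GET and __contains__ to HEAD is only approximately how an s3fs-backed store behaves:

import collections.abc

class CountingStore(collections.abc.MutableMapping):
    """Tally accesses to an underlying zarr store mapping."""
    def __init__(self, inner):
        self.inner = inner
        self.gets = 0    # content reads: roughly GET requests in an s3fs store
        self.heads = 0   # existence checks: roughly HEAD requests
    def __getitem__(self, key):
        self.gets += 1
        return self.inner[key]
    def __contains__(self, key):
        self.heads += 1
        return key in self.inner
    def __setitem__(self, key, value):
        self.inner[key] = value
    def __delitem__(self, key):
        del self.inner[key]
    def __iter__(self):
        return iter(self.inner)
    def __len__(self):
        return len(self.inner)

counting = CountingStore(store)        # `store` from the snippet above
d = zarr.open(counting, 'r')
print(d.info)
print(counting.gets, counting.heads)   # compare with the numbers above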

Implementing a consolidated access mode could improve the situation. In Python, the example above can be simplified to a single request:

  • 1 GET request to fetch the content of /.zmetadata (note that not even a HEAD request is made in advance)

import os
import s3fs
import zarr

s3 = s3fs.S3FileSystem(endpoint_url=f'https://{os.environ["ENDPOINT"]}/', anon=True)
store = s3fs.S3Map(root=f'{os.environ["BUCKET"]}/{os.environ["DATASET"]}', s3=s3)
d = zarr.open_consolidated(store)  # reads the whole hierarchy from .zmetadata
print(d.info)
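
For context, the Zarr v2 consolidated layout (as produced by zarr.consolidate_metadata or xarray) is a single JSON object whose "metadata" map holds every .zgroup/.zarray/.zattrs document keyed by its store path, so one fetch recovers the whole tree. A quick way to inspect it, assuming the store from above (the printed keys are illustrative):

import json

zmeta = json.loads(store[".zmetadata"])
print(zmeta["zarr_consolidated_format"])  # 1 for the v2 layout
print(sorted(zmeta["metadata"])[:4])      # e.g. ['.zattrs', '.zgroup', 'tas/.zarray', ...]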

If this is desired, perhaps it could be supported by other modes, like file (or even zip!?), as well. In that case I think it would be part of the zarr API and not specific to the zmap S3 implementation.

I will try to come up with a PR for this, but it would be great to have some feedback first and, if positive, some pointers/draft on how to support it (via #mode=consolidated controls? An environment variable? Only when built --with-consolidated-zarr?)

Thanks!

mannreis avatar Aug 19 '24 13:08 mannreis

:+1: for support of Zarr v2 "consolidated". A discussion on the zarr-python team yesterday touched on how to deal with possible differences in the definition of consolidated between the v2 and v3 formats. The decision was to add arguments to enable the v2 consolidated format in the v3 library, but potentially disallow those arguments when producing the v3 format (since the v3 library will need to support both the v2 and v3 formats).

joshmoore avatar Aug 26 '24 14:08 joshmoore

Sorry, I apparently missed this Issue when it was first posted. In any case, we have always planned to support consolidated metadata for both V2 and V3. The problem was that there appeared to be no specification of the JSON for consolidated metadata. Has that changed? Can you point me to that spec? Josh's note about V3 supporting both V2 and V3 is unclear. I get that the actual (Python) library will need to read files in both V2 and V3 formats. But I do not understand this remark:

The decision was to add arguments to enable the v2 consolidated format to the v3 library, but potentially disallow those arguments when producing the v3 format

What kind of arguments are being considered?

DennisHeimbigner avatar Aug 26 '24 17:08 DennisHeimbigner

The problem was that there appeared to be no specification for the JSON for consolidated metadata. Has that changed? Can you point me to that spec?

No, that has not changed, but agreed that that is a difficulty for the v2 format.

What kind of arguments are being considered?

Correction: for zarr-python library v2, I should have said "methods" or "API" for activating consolidated metadata. (Those don't yet exist for zarr-python library v3.) The method arguments I was thinking of are in xarray: https://docs.xarray.dev/en/stable/user-guide/io.html#consolidated-metadata
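
For the record, those xarray arguments look like this. A minimal local sketch; the file name and the tiny dataset are made up for illustration:

import numpy as np
import xarray as xr

ds = xr.Dataset({"tas": ("time", np.arange(3.0))})
# Writing with consolidated=True also emits a .zmetadata key in the store.
ds.to_zarr("example.zarr", mode="w", consolidated=True)
# Reading with consolidated=True takes the whole tree from that single key.
back = xr.open_zarr("example.zarr", consolidated=True)
print(back)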

joshmoore avatar Aug 26 '24 19:08 joshmoore

Ok, I see. So the big holdup at the moment is a JSON spec for consolidated metadata for V2 and another for V3.

DennisHeimbigner avatar Aug 26 '24 19:08 DennisHeimbigner

The discussion around V3 is currently ongoing. It's unlikely that there will be significant work on a V2 "spec". (I would certainly be for having an "upgrade guide" between the two which may be as close as we can come.)

joshmoore avatar Aug 26 '24 19:08 joshmoore

Thanks for the discussion! In the meantime I've tried to simply add a "caching layer" to the metadata functions that GET the .z* files, to see what the difference would be [1]. I've opened #2992, but it's a draft and perhaps not useful in the long term.

[1]

$ time ncdump/ncdump -v tas https://${ENDPOINT}/${BUCKET}/${DATASET}/#mode=zarr,s3,consolidated | tail -n+2 | md5sum
abd28bc55fb9d0c25a3767a43d27110a -
real	0m0.262s
user	0m0.155s
sys	0m0.022s

mannreis avatar Aug 27 '24 07:08 mannreis