Example zarr files for interoperability testing
Some example files to use for testing: netcdf like data: http://tinyurl.com/GLDAS-NOAH025-3H geotiff like data: https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip
netcdf-like 4-d data: http://tinyurl.com/4dzarr
Note: the bounds variables for this zarr store are weird because I used xarray and it doesn't understand bounds variables.
This is a very small, toy zarr store with no compression. It is netcdf-like: zarr_no_compression.zarr.tar.gz
Zarr store with _CRS following gdal srs-encoding. This can be drag and dropped in qgis (with gdal 3.8+, which can be installed which conda install qgis).
My initial thoughts, I have a lot more to explore here and I'll do that with examples via code and visualizations.
geotiff-like
certainly has the geotransform and the CRS, but these are not picked up and registered up GDAL (note that the Metadata is just "bag of metadata", until it's registered in the GDAL model).
## discover contents with
###gdalinfo /vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip
## trim to first two bands for compact output
gdalinfo vrt:///vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip/planet-fusion.zarr/?bands=1,2
Driver: VRT/Virtual Raster
Files: /vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip/planet-fusion.zarr/
Size is 256, 256
Metadata:
_CRS=PROJCS["WGS 84 / UTM zone 14N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-99],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32614"]]
_TRANSFORM={3,0,696000,0,-3,4536000}
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 256.0)
Upper Right ( 256.0, 0.0)
Lower Right ( 256.0, 256.0)
Center ( 128.0, 128.0)
Band 1 Block=32x32 Type=Int16, ColorInterp=Undefined
Metadata:
DIM_band_INDEX=0
DIM_band_VALUE=blue
DIM_datetime_INDEX=0
DIM_datetime_UNIT=days since 2021-01-01 00:00:00
DIM_datetime_VALUE=0
Band 2 Block=32x32 Type=Int16, ColorInterp=Undefined
Metadata:
DIM_band_INDEX=1
DIM_band_VALUE=green
DIM_datetime_INDEX=0
DIM_datetime_UNIT=days since 2021-01-01 00:00:00
DIM_datetime_VALUE=0
I think the transform is in "rasterio" form and not GDAL form, I think that's not a good idea but I don't know how most formats do that.
GDAL geotransform: https://gdal.org/en/latest/tutorials/geotransforms_tut.html
GT(0) x-coordinate of the upper-left corner of the upper-left pixel.
GT(1) w-e pixel resolution / pixel width.
GT(2) row rotation (typically zero).
GT(3) y-coordinate of the upper-left corner of the upper-left pixel.
GT(4) column rotation (typically zero).
GT(5) n-s pixel resolution / pixel height (negative value for a north-up image).
rasterio geotransform is a different order:
import rasterio
rasterio.Affine(3,0,696000,0,-3,4536000)
Affine(3.0, 0.0, 696000.0,
0.0, -3.0, 4536000.0)
rasterio.Affine(3,0,696000,0,-3,4536000).to_gdal()
(696000.0, 3.0, 0.0, 4536000.0, 0.0, -3.0)
crs_enabled_zarr_and_tif.zip
this one looks good
gdalinfo /vsizip/vsicurl/https://github.com/zarr-developers/geozarr-spec/files/14214740/crs_enabled_zarr_and_tif.zip/some.zarr/
Driver: Zarr/Zarr
Files: /vsizip/vsicurl/https://github.com/zarr-developers/geozarr-spec/files/14214740/crs_enabled_zarr_and_tif.zip/some.zarr/some/.zarray
/vsizip/vsicurl/https://github.com/zarr-developers/geozarr-spec/files/14214740/crs_enabled_zarr_and_tif.zip/some.zarr/
Size is 8192, 8192
Coordinate System is:
PROJCRS["WGS 84 / NSIDC EASE-Grid 2.0 Global",
BASEGEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
MEMBER["World Geodetic System 1984 (G2296)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["US NSIDC EASE-Grid 2.0 Global",
METHOD["Lambert Cylindrical Equal Area",
ID["EPSG",9835]],
PARAMETER["Latitude of 1st standard parallel",30,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8823]],
PARAMETER["Longitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["False easting",0,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Environmental science - used as basis for EASE grid."],
AREA["World between 86°S and 86°N."],
BBOX[-86,-180,86,180]],
ID["EPSG",6933]]
Data axis to CRS axis mapping: 1,2
Origin = (400000.000000000000000,6000000.000000000000000)
Pixel Size = (10.000000000000000,-10.000000000000000)
Metadata:
AREA_OR_POINT=Area
grid_mapping=spatial_ref
Corner Coordinates:
Upper Left ( 400000.000, 6000000.000) ( 4d 8'44.40"E, 54d55'30.99"N)
Lower Left ( 400000.000, 5918080.000) ( 4d 8'44.40"E, 53d49'56.85"N)
Upper Right ( 481920.000, 6000000.000) ( 4d59'40.92"E, 54d55'30.99"N)
Lower Right ( 481920.000, 5918080.000) ( 4d59'40.92"E, 53d49'56.85"N)
Center ( 440960.000, 5959040.000) ( 4d34'12.66"E, 54d22'30.95"N)
Band 1 Block=1024x512 Type=Float32, ColorInterp=Undefined
NoData Value=nan
Metadata:
DIM_band_INDEX=0
Band 2 Block=1024x512 Type=Float32, ColorInterp=Undefined
NoData Value=nan
Metadata:
DIM_band_INDEX=1
Band 3 Block=1024x512 Type=Float32, ColorInterp=Undefined
NoData Value=nan
Metadata:
DIM_band_INDEX=2
(much) more to come while I get up to speed. I have realized that there's two things going on, Zarr now should be understood by GDAL better, there will certainly be fixes to be made in that software, and that will flow into QGIS (though there might be Q-specific features that also interact). There is plenty of Zarr in the world that won't have a transform, and GDAL will still work well with that. Having geo-zarr is really a new set of features for it to support, and I hope it stays aligned with xarray PR 9543 before constructing much new stuff.
I'm interested that the two examples above seem to have different ways of encoding the transform and crs, so I'll explore how that's done.
ah, I get it, the "_CRS" attribute name is defined here:
https://github.com/OSGeo/gdal/blob/f5076e774a8f43ec4ac7a6d8c234efe870dcc32a/frmts/zarr/zarr.h#L30
there is no "_TRANSFORM" attribute yet known to GDAL. Should it? Should it use rasterio transform ordering or GDAL transform ordering? (I'll explore this)
The _CRS is picked up by GDAL from the some.zarr, but not from the planet-fusion zarr, I don't know why yet.
The transform is not present in some.zarr, that is registered because GDAL has heuristics to determine the transform from the coordinates. But the planet-fusion zarr doesn't have these, and GDAL doesn't know about "_TRANSFORM".
(sorry a lot of this is maybe obvious but I'll keep notes as I pivot in in case they're helpful to others)
I don't think GDAL understands the ways that Zarr is using to store CRS. Trying this I see
gdalinfo "ZARR:\"/vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip/planet-fusion.zarr\":/sr"
Driver: Zarr/Zarr
Files: /vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip/planet-fusion.zarr/sr/.zarray
Size is 256, 256
Metadata:
_CRS=PROJCS["WGS 84 / UTM zone 14N",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-99],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32614"]]
_TRANSFORM={3,0,696000,0,-3,4536000}
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 256.0)
Upper Right ( 256.0, 0.0)
Lower Right ( 256.0, 256.0)
Center ( 128.0, 128.0)
Band 1 Block=32x32 Type=Int16, ColorInterp=Undefined
<snip>
The CRS is there but it's just metadata. There's something wrong with how GDAL is trained to pick this up. I'm seeing this repeatedly and have not yet found an example Zarr where the CRS is picked up, but I expect there are CF- grid_mapping examples that already work?
If it was working, without this "vrt://" workaround we would see "Coordinate System is: " as here:
gdalinfo "vrt://ZARR:\"/vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip/planet-fusion.zarr\":/sr?a_srs=EPSG:32614" -nomd
Driver: VRT/Virtual Raster
Files: ZARR:"/vsizip//vsicurl/https://storage.googleapis.com/pl-amit-public/geozarr/planet-fusion.zarr.zip/planet-fusion.zarr":/sr
Size is 256, 256
Coordinate System is:
PROJCRS["WGS 84 / UTM zone 14N",
BASEGEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
MEMBER["World Geodetic System 1984 (G2296)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["UTM zone 14N",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-99,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Navigation and medium accuracy spatial referencing."],
AREA["Between 102°W and 96°W, northern hemisphere between equator and 84°N, onshore and offshore. Canada - Manitoba; Nunavut; Saskatchewan. Mexico. United States (USA)."],
BBOX[0,-102,84,-96]],
ID["EPSG",32614]]
Data axis to CRS axis mapping: 1,2
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 256.0)
Upper Right ( 256.0, 0.0)
Lower Right ( 256.0, 256.0)
Center ( 128.0, 128.0)
Band 1 Block=32x32 Type=Int16, ColorInterp=Undefined
Band 2 Block=32x32 Type=Int16, ColorInterp=Undefined
<snip>
I've been a while warming up to this, as mostly I was confused about how to actually reference Zarr stores without some software wrapper around it, also there's a quoting problem with ZARRs at a "/vsicurl" endpoint, where extra quotes that usually aren't needed make things work (something I'm trying to pin down to fix in GDAL itself). I'm not sure there's been any dialogue with GDAL itself, it seems mostly involved with xarray and rioxarray but I think there's some missing fundamentals here and I'll keep exploring.
There's also obviously no georeferencing here or coordinates on the data, but I see
"sr/.zattrs": {
"_ARRAY_DIMENSIONS": [
"datetime",
"band",
"row",
"col"
],
"_CRS": "PROJCS[\"WGS 84 / UTM zone 14N\",GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]],PROJECTION[\"Transverse_Mercator\"],PARAMETER[\"latitude_of_origin\",0],PARAMETER[\"central_meridian\",-99],PARAMETER[\"scale_factor\",0.9996],PARAMETER[\"false_easting\",500000],PARAMETER[\"false_northing\",0],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],AXIS[\"Easting\",EAST],AXIS[\"Northing\",NORTH],AUTHORITY[\"EPSG\",\"32614\"]]",
"_TRANSFORM": [
3.0,
0.0,
696000.0,
0.0,
-3.0,
4536000.0
]
so at very least we need to investigate exposing this information to GDAL. Has anyone investigated porting that in? And, whether, hopefully this kind of support goes into the Zarr spec itself? Is there efforts to match the xarray flexible coordinates of PR 9543 with Zarr?
so I think the problems is that GDAL is looking for a valid key in "_CRS" of either url, wkt, or projjson, or grid_mapping - but in this example is the value of the _CRS element itself (iiuc)
https://github.com/OSGeo/gdal/blob/b047461f3854bc9f0e105a458666ebf6a8e397d9/frmts/zarr/zarr_array.cpp#L2722
I'm not familiar enough with other cases but I'm going to keep exploring. I'm very dim on what ZARR itself accommodates and v2 vs v3, compared to what geozarr adds still.
This comment above makes reference to how GDAL handles quotes in paths. I recently created this issue which addresses some of the same concerns in QGIS / GDAL so I thought it should be linked here.
Closing as stale. No recent activity and likely no longer relevant. Can be reopened if needed.