xarray
xarray copied to clipboard
Y-axis is reversed when using to_zarr()
What happened?
When I export a dataset to NetCDF and Zarr, the y axis appears to have been reversed with gdalinfo. I also cannot build a vrt file with the Zarr file since it complains about positive NS axis, but this works fine with the NetCDF file.
Example NetCDF file as input: in.nc.zip
gdalinfo on output NetCDF file:
$ gdalinfo NETCDF:out.nc:air_temperature_2m
Driver: netCDF/Network Common Data Format
Files: out.nc
out.nc.aux.xml
Size is 949, 1069
Coordinate System is:
PROJCRS["unnamed",
BASEGEOGCRS["unknown",
DATUM["unnamed",
ELLIPSOID["Sphere",6371000,0,
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433,
ID["EPSG",9122]]]],
CONVERSION["unnamed",
METHOD["Lambert Conic Conformal (2SP)",
ID["EPSG",9802]],
PARAMETER["Latitude of false origin",63.3,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8821]],
PARAMETER["Longitude of false origin",15,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8822]],
PARAMETER["Latitude of 1st standard parallel",63.3,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8823]],
PARAMETER["Latitude of 2nd standard parallel",63.3,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8824]],
PARAMETER["Easting at false origin",0,
LENGTHUNIT["metre",1],
ID["EPSG",8826]],
PARAMETER["Northing at false origin",0,
LENGTHUNIT["metre",1],
ID["EPSG",8827]]],
CS[Cartesian,2],
AXIS["easting",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["northing",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
Data axis to CRS axis mapping: 1,2
Origin = (-1061334.000000000000000,1338732.125000000000000)
Pixel Size = (2500.000000000000000,-2500.000000000000000)
Metadata:
air_temperature_2m#coordinates=longitude latitude
air_temperature_2m#grid_mapping=projection_lambert
air_temperature_2m#long_name=Screen level temperature (T2M)
air_temperature_2m#standard_name=air_temperature
air_temperature_2m#units=K
air_temperature_2m#_FillValue=9.96921e+36
height1#description=height above ground
height1#long_name=height
height1#positive=up
height1#units=m
height1#_FillValue=nan
NC_GLOBAL#coordinates=projection_lambert time
NETCDF_DIM_EXTRA={height1}
NETCDF_DIM_height1_DEF={1,5}
NETCDF_DIM_height1_VALUES=2
projection_lambert#earth_radius=6371000
projection_lambert#grid_mapping_name=lambert_conformal_conic
projection_lambert#latitude_of_projection_origin=63.3
projection_lambert#longitude_of_central_meridian=15
projection_lambert#standard_parallel={63.3,63.3}
x#long_name=x-coordinate in Cartesian system
x#standard_name=projection_x_coordinate
x#units=m
x#_FillValue=nan
y#long_name=y-coordinate in Cartesian system
y#standard_name=projection_y_coordinate
y#units=m
y#_FillValue=nan
Geolocation:
LINE_OFFSET=0
LINE_STEP=1
PIXEL_OFFSET=0
PIXEL_STEP=1
SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
X_BAND=1
X_DATASET=NETCDF:"out.nc":longitude
Y_BAND=1
Y_DATASET=NETCDF:"out.nc":latitude
Corner Coordinates:
Upper Left (-1061334.000, 1338732.125) ( 18d10'24.02"W, 72d45'59.56"N)
Lower Left (-1061334.000,-1333767.875) ( 0d15'55.60"E, 50d18'23.10"N)
Upper Right ( 1311166.000, 1338732.125) ( 54d17'24.85"E, 71d34'43.38"N)
Lower Right ( 1311166.000,-1333767.875) ( 33d 2'20.10"E, 49d45' 6.51"N)
Center ( 124916.000, 2482.125) ( 17d30' 3.21"E, 63d18' 1.50"N)
Band 1 Block=949x1069 Type=Float32, ColorInterp=Undefined
Min=236.480 Max=284.937
Minimum=236.480, Maximum=284.937, Mean=269.816, StdDev=9.033
NoData Value=9.96920996838686905e+36
Unit Type: K
Metadata:
coordinates=longitude latitude
grid_mapping=projection_lambert
long_name=Screen level temperature (T2M)
NETCDF_DIM_height1=2
NETCDF_VARNAME=air_temperature_2m
standard_name=air_temperature
STATISTICS_MAXIMUM=284.93682861328
STATISTICS_MEAN=269.81614967971
STATISTICS_MINIMUM=236.47978210449
STATISTICS_STDDEV=9.0332172122638
units=K
_FillValue=9.96921e+36
gdalinfo on output Zarr file:
$ gdalinfo ZARR:out.zarr:/air_temperature_2m:0
Driver: Zarr/Zarr
Files: none associated
Size is 949, 1069
Origin = (-1061334.000000000000000,-1333767.875000000000000)
Pixel Size = (2500.000000000000000,2500.000000000000000)
Metadata:
coordinates=longitude latitude
grid_mapping=projection_lambert
long_name=Screen level temperature (T2M)
standard_name=air_temperature
Corner Coordinates:
Upper Left (-1061334.000,-1333767.875)
Lower Left (-1061334.000, 1338732.125)
Upper Right ( 1311166.000,-1333767.875)
Lower Right ( 1311166.000, 1338732.125)
Center ( 124916.000, 2482.125)
Band 1 Block=475x268 Type=Float32, ColorInterp=Undefined
NoData Value=9.96920996838686905e+36
Unit Type: K
The main issue is that the origin and y-axis direction is reversed, as you can see from the origin and pixel size. I have tried taking the CRS from the netcdf and adding it to the Zarr file as a _CRS attribute manually, but this doesn't make any difference to the origin or pixel size.
What did you expect to happen?
Origin, pixel size and corner coords should match those in the netcdf file.
$ gdalinfo ZARR:out.zarr:/air_temperature_2m:0
Driver: Zarr/Zarr
Files: none associated
Size is 949, 1069
Origin = (-1061334.000000000000000,1338732.125000000000000)
Pixel Size = (2500.000000000000000,-2500.000000000000000)
Metadata:
coordinates=longitude latitude
grid_mapping=projection_lambert
long_name=Screen level temperature (T2M)
standard_name=air_temperature
Corner Coordinates:
Corner Coordinates:
Upper Left (-1061334.000, 1338732.125) ( 18d10'24.02"W, 72d45'59.56"N)
Lower Left (-1061334.000,-1333767.875) ( 0d15'55.60"E, 50d18'23.10"N)
Upper Right ( 1311166.000, 1338732.125) ( 54d17'24.85"E, 71d34'43.38"N)
Lower Right ( 1311166.000,-1333767.875) ( 33d 2'20.10"E, 49d45' 6.51"N)
Center ( 124916.000, 2482.125) ( 17d30' 3.21"E, 63d18' 1.50"N)
Band 1 Block=475x268 Type=Float32, ColorInterp=Undefined
NoData Value=9.96920996838686905e+36
Unit Type: K
Minimal Complete Verifiable Example
import xarray as xr
from pyproj import CRS
ds = xr.open_dataset("in.nc")
# Optionally take copy CRS to Zarr (produces and error, but does work)
crs_wkt = CRS.from_cf(ds["projection_lambert"].attrs).to_wkt()
ds["air_temperature_2m"] = ds["air_temperature_2m"].assign_attrs(_CRS={"wkt": crs_wkt})
ds.to_zarr("out.zarr")
ds.to_netcdf("out.nc")
MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
No response
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None python: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0] python-bits: 64 OS: Linux OS-release: 6.5.0-15-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.9.3-development
xarray: 2024.1.1 pandas: 2.2.0 numpy: 1.26.4 scipy: None netCDF4: 1.6.5 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.1 cftime: 1.6.3 nc_time_axis: None iris: None bottleneck: None dask: None distributed: None matplotlib: None cartopy: None seaborn: None numbagg: None fsspec: None cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 68.2.2 pip: 23.3.1 conda: None pytest: None mypy: None IPython: None sphinx: None
Thanks for opening your first issue here at xarray! Be sure to follow the issue template! If you have an idea for a solution, we would really welcome a Pull Request with proposed changes. See the Contributing Guide for more. It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better. Thank you!
Thanks for documenting this @alistaireverett, it is a bit surprising. Interestingly the default output Zarr is identical in memory to the starting netCDF dataset:
ds = xr.open_dataset("in.nc")
ds.to_zarr("out.zarr")
dz = xr.open_zarr("out.zarr")
xr.testing.assert_identical(ds, dz) #True
Also bypassing Xarray entirely and just using GDAL still has this flipped y-axis, so I'm thinking there is some issue with in.nc rather than Xarray's to_zarr() method.
gdalmdimtranslate -of ZARR in.nc out-gdal.zarr
gdalinfo ZARR:"out-gdal.zarr":/air_temperature_2m
# Origin = (-1061334.000000000000000,-1333767.875000000000000)
# Pixel Size = (2500.000000000000000,2500.000000000000000)
Can you share a bit more about where in.nc comes from and check on the metadata structure? https://github.com/cedadev/cf-checker could be useful here, for example I see a few warnings, GDAL can have trouble if you have a non-compliant NetCDF (https://github.com/OSGeo/gdal/issues/8477):
CHECKING NetCDF FILE: in.nc
=====================
Using CF Checker Version 4.1.0
Checking against CF Version CF-1.8
Using Standard Name Table Version 84 (2024-01-19T15:55:10Z)
Using Area Type Table Version 11 (06 July 2023)
Using Standardized Region Name Table Version 4 (18 December 2018)
INFO: Attribute coordinates is being used in a non-standard way; as a global attribute. (See Appendix A)
WARN: (2.6.1): No 'Conventions' attribute present
Closing as upstream; please reopen with an MCVE if that's not correct