xarray_leaflet icon indicating copy to clipboard operation
xarray_leaflet copied to clipboard

Zarr support

Open christophenoel opened this issue 3 years ago • 20 comments

Hello,

Looks very interesting. But I can't manage to make it working with a zarr store actually. It seems rioxarray is not compliant with zarr format. Thanks for any tip.

christophenoel avatar Oct 18 '21 10:10 christophenoel

What do you mean with "rioxarray not compliant with zarr format"? After loading your zarr array, you should set the CRS and nodata if they are not already set:

davidbrochart avatar Oct 18 '21 10:10 davidbrochart

When I try opening the Zarr store,

store = s3fs.S3Map(root='my.zarr/', s3=myBucket, check=False)
da = rioxarray.open_rasterio(store, masked=True)

It states TypeError: unhashable type: 'FSMap'. Maybe this requires the file to be local ?

christophenoel avatar Oct 18 '21 10:10 christophenoel

Maybe you should open the Zarr store with xarray (replace gcs with s3):

ds_gcs = xr.open_dataset(
    "gcs://<bucket-name>/path.zarr",
    backend_kwargs={
        "storage_options": {"project": "<project-name>", "token": None}
    },
    engine="zarr",
)

davidbrochart avatar Oct 18 '21 10:10 davidbrochart

All apologies ! I concluded from your example that rasterio was mandatory.

I will try then to see why I can't manage to make it work using xarray. It seems I first have to rename lat/lon to x/y. Thanks.

christophenoel avatar Oct 18 '21 12:10 christophenoel

I concluded from your example that rasterio was mandatory.

You're right, it is mandatory to set the CRS and nodata, but this can be done after loading your array.

I will try then to see why I can't manage to make it work using xarray. It seems I first have to rename lat/lon to x/y. Thanks.

You can pass e.g. plot(m, x_dim="lon", y_dim="lat").

davidbrochart avatar Oct 18 '21 13:10 davidbrochart

Thanks so much for your support. However, I have tried with the code below, and this is unsucessful: nothing is displayed, and after a while a javascript error is shown ():

rf = dataset['reflectance'] 
r2 = rf.isel(wavelength=0,latitude=slice(0,1000),longitude=slice(0,1000))
r2.plot.pcolormesh(cmap='terrain') 
r2.rio.write_crs("EPSG:32630",inplace=True)
r2.rio.write_nodata(0, inplace=True)
#r2.chunk((100, 100))
#warnings.filterwarnings("ignore")
m = Map(center=[40, 115], zoom=3, basemap=basemaps.CartoDB.DarkMatter, interpolation='nearest')
m
r2.leaflet.plot(m, x_dim="longitude", y_dim="latitude", colormap=plt.cm.terrain)

christophenoel avatar Oct 19 '21 09:10 christophenoel

Could you post the output of r2?

davidbrochart avatar Oct 19 '21 09:10 davidbrochart

As you have noticed, I'm a python newbie. Here it is:

<xarray.DataArray 'reflectance' (latitude: 1000, longitude: 1000)>
dask.array<getitem, shape=(1000, 1000), dtype=uint16, chunksize=(256, 256), chunktype=numpy.ndarray>
Coordinates:
  * latitude     (latitude) float64 4.406e+06 4.406e+06 ... 4.436e+06 4.436e+06
  * longitude    (longitude) float64 2.477e+05 2.477e+05 ... 2.776e+05 2.777e+05
    wavelength   float64 402.4
    spatial_ref  int64 0
Attributes:
    addoffset:    0.0
    scalefactor:  1.5259021896696422e-05
    units:        reflectance
    _FillValue:   0

Latitude: array([4405572.5, 4405602.5, 4405632.5, ..., 4435482.5, 4435512.5, 4435542.5]) Longitude: array([247692.484375, 247722.484388, 247752.4844 , ..., 277602.496918, 277632.49693 , 277662.496943])

christophenoel avatar Oct 19 '21 09:10 christophenoel

As you have noticed, I'm a python newbie

No worries, we've all been there :smile:

  • latitude (latitude) float64 4.406e+06 4.406e+06 ... 4.436e+06 4.436e+06
  • longitude (longitude) float64 2.477e+05 2.477e+05 ... 2.776e+05 2.777e+05

These don't seem to be in degrees, right? What is the coordinate system for your array?

davidbrochart avatar Oct 19 '21 09:10 davidbrochart

This uses the UTM 30N (EPSG 32630) projection: https://epsg.io/32630

christophenoel avatar Oct 19 '21 10:10 christophenoel

I tried after reprojecting to EPSG:4326 using pyproj:

from pyproj import Transformer

r3 = rf.isel(wavelength=0,latitude=slice(0,1000),longitude=slice(0,1000))

Retrieve dataset and coordinates

lat = r3.latitude.data
lon = r3.longitude.data

# Reproject coordinates
transformer = Transformer.from_crs(32630, 4326) # UTM30 to EPSG 4326
latlon_reprojected = transformer.transform(lon,lat) #Reproject lat lon
new_lon = latlon_reprojected[0]
new_lat = latlon_reprojected[1]

# Assign reprojected coordinates
r3 = r3.assign_coords(latitude=new_lat)
r3 = r3.assign_coords(longitude=new_lon)
r3 = r3.assign_coords({"longitude": new_lon, "latitude": new_lat})

This generates:

<xarray.DataArray 'reflectance' (latitude: 1000, longitude: 1000)>
dask.array<getitem, shape=(1000, 1000), dtype=uint16, chunksize=(256, 256), chunktype=numpy.ndarray>
Coordinates:
  * latitude     (latitude) float64 -5.945 -5.945 -5.945 ... -5.606 -5.606
  * longitude    (longitude) float64 39.76 39.76 39.76 ... 40.04 40.04 40.04
    wavelength   float64 402.4
    spatial_ref  int64 0
Attributes:
    addoffset:    0.0
    scalefactor:  1.5259021896696422e-05
    units:        reflectance
    _FillValue:   0

The javascript errors are still the same:

272.9d70a85a9abe209402d0.js:1 Error: Could not create a view for model id 7d81b0bb761542018f0594122b52f405
272.9d70a85a9abe209402d0.js:1 Error: Could not create child view
138.dd4ad020192640a8ffae.js:1 Uncaught (in promise) TypeError: Cannot read properties of undefined (reading '_fadeAnimated')

Note that the CRS attribute (epsg:4326) seems not present anywhere in the object. I assume rasterio expects x,y variables.

EDIT:

I tried setting the dim_x / dim_Y also, but still an error.

r3.rio.set_spatial_dims("longitude","latitude",inplace=True)
r3.rio.write_crs("EPSG:4326",inplace=True)
r3.rio.write_nodata(0, inplace=True)

Result array:

<xarray.DataArray 'reflectance' (latitude: 1000, longitude: 1000)>
dask.array<getitem, shape=(1000, 1000), dtype=uint16, chunksize=(256, 256), chunktype=numpy.ndarray>
Coordinates:
  * latitude     (latitude) float64 -5.945 -5.945 -5.945 ... -5.606 -5.606
  * longitude    (longitude) float64 39.76 39.76 39.76 ... 40.04 40.04 40.04
    wavelength   float64 402.4
    spatial_ref  int64 0
Attributes:
    addoffset:    0.0
    scalefactor:  1.5259021896696422e-05
    units:        reflectance
    _FillValue:   0

christophenoel avatar Oct 19 '21 11:10 christophenoel

It looks like you have a widget issue. Can you try installing in a fresh environment? I recommend using conda.

davidbrochart avatar Oct 19 '21 12:10 davidbrochart

Thanks for the tip. I have tried from Binder on your repo and locally from the conda env below, but this doesn't change anything.

name: zarr-leaflet
channels:
  - conda-forge
dependencies:
  - requests
  - tqdm
  - scipy
  - dask
  - rasterio=1.2.6
  - jupyterlab=3
  - ipykernel
  - xarray_leaflet=0.1.15
  - s3fs
  - fsspec
  - zarr
  - pyproj

christophenoel avatar Oct 21 '21 14:10 christophenoel

Hard to say without the code.

davidbrochart avatar Oct 21 '21 15:10 davidbrochart

You're right. I have created a repo and a binder:

https://github.com/christophenoel/zarr-leaflet

christophenoel avatar Oct 22 '21 07:10 christophenoel

Thanks, it looks like you still don't have lat/lon in degrees:

Coordinates:
    latitude     (y) float64 4.406e+06 4.406e+06 ... 4.436e+06 4.436e+06
    longitude    (x) float64 2.477e+05 2.477e+05 ... 2.776e+05 2.777e+05
    wavelength   float64 402.4
    spatial_ref  int64 0

davidbrochart avatar Oct 22 '21 07:10 davidbrochart

Sorry, I made so many tests that I introduced an error. I added file zarr-leaflet-4326.ipynb which reproject to EPSG:4326.

christophenoel avatar Oct 22 '21 07:10 christophenoel

For info, I have also added zarr-xy-leaflet.ipynb : in this NB I convert dims and coordinates name to x,y then use rioxarray to reproject the array. Same problem though... ;)

christophenoel avatar Oct 22 '21 14:10 christophenoel

Hello David,

Have you had some time to have a look to my binder? Any idea of the possible cause ? Maybe the extend is wrong or should be specificed ?

Binder

As you can see in zarr-xy-leaflet.ipynb, the coordinates look good:

Coordinates:
  * x            (x) float64 -5.945 -5.945 -5.945 ... -5.512 -5.511 -5.511
  * y            (y) float64 39.77 39.77 39.77 39.77 ... 39.76 39.76 39.76 39.76
    wavelength   float64 402.4
    spatial_ref  int64 0
Attributes:
    addoffset:    0.0
    scalefactor:  1.5e-05
    units:        reflectance
    _FillValue:   11

EDIT: I also tried by reverting the y order and it still fails.

r3.rio.bounds(recalc=True)
(-5.945656524967464, 39.762457046537364, -5.510672449086092, 39.77470519026692)

Thanks.

christophenoel avatar Nov 04 '21 14:11 christophenoel

Hi Christophe, sorry I didn't have time to look at it. I don't know why it doesn't work.

davidbrochart avatar Nov 04 '21 15:11 davidbrochart