pyresample icon indicating copy to clipboard operation
pyresample copied to clipboard

Add conservative resampler

Open zxdawn opened this issue 3 years ago • 12 comments

The conservative resampler is

The value of a destination cell is calculated as the weighted sum of the values of the source cells that it overlaps.

according to ESMPy.

It's also supported in xesmf.

If we can add it to pyresample, it would be nice.

zxdawn avatar Jul 03 '22 21:07 zxdawn

Would you want to use this in satpy? If satpy could access the implementation in xesmf would that be good enough? Would you be using xarray objects with it?

Also, there are some weighted resampling functions already in numpy-based parts of pyresample. Also EWA, which is in pyresample, is a weighted average but requires the data to be scan based (ex. Polar-orbiters).

djhoese avatar Jul 04 '22 02:07 djhoese

Yes, I wanna use it with satpy. I have run it successfully by converting the Scene to Dataset and regridding it by xESMF. Because xESMF needs specific lon/lat/lon_b/lat_b variables, it's harder to access it in Satpy.

And, I tried to reproduce xESMF results using EWA resampler, but the result looks strange. Here's the comparison notebook

zxdawn avatar Jul 04 '22 20:07 zxdawn

Here's the concept of conservative resampler which considers the weighted average:

image

zxdawn avatar Jul 05 '22 13:07 zxdawn

For EWA, you'd want to play with these parameters:

https://github.com/pytroll/pyresample/blob/74eb088c14c15a524721b9e896073744f4234133/pyresample/ewa/dask_ewa.py#L512-L519

Set weight_distance_max to 2.0 and weight_delta_max to 40.0 as a starting point. That's what I use for VIIRS data.

EWA definitely only uses the "center" points of the pixels. It assumes a pixel size based on the curvature of the earth and the rows per scan. So if tropomi isn't scan based then what you've done with setting rows_per_scan to the total number of rows is "good enough", but also means EWA is less likely to make pretty results compared to other algorithms.

We don't have any resampling algorithms in pyresample that use other information beyond the center of the pixel. I think the best route forward would be to add functionality in satpy to allow for this resampler to be used from xesmf. If you could make it have nice defaults for the bounding coordinate information so that sensors without this information could still use it that'd be really cool, but an exception being raised would be fine with me too.

djhoese avatar Jul 05 '22 15:07 djhoese

Got it. For the new conservative resampler, here's my idea: If lon_b and lat_b are not available, we can create them following my notebook method which uses xgcm. I suppose satpy-supported datasets usually just have center points?

Oh ... Maybe it's better to make xESMF create the lon_b and lat_b automatically if they don't exist. Then, it would be easy to call it in pyresample.

zxdawn avatar Jul 05 '22 15:07 zxdawn

If it makes mathematical sense to compute the bounds from the center points then sure that seems interesting. I see your xesmf issue above. Another option is we could make SwathDefinitions be smarter about additional metadata (like rows_per_scan or bounding information) and have methods on it for returning or generating that information from the information contained in the SwathDefinition.

djhoese avatar Jul 05 '22 15:07 djhoese

I was just working these days on something related to that and I have made the following thoughts.

I wonder if we could add methods to AreaDef and SwathDef to retrieve:

  • cell centers arrays (N, M) --> get_lonlats() [ALREADY AVAILABLE]
  • cell corners arrays (N+1, M+1) --> get_lonlats_corners() [MISSING]
  • cell bounds arrays (N, M, nv) --> get_lonlats_bounds() [MISSING]

In addition to being useful for conservative remapping with xESMF, these methods could be used to develop the following ones:

  • SwathDefinition.upsample
  • SwathDefinition.extend
  • area = area_def.get_cell_area() # swath_def.get_cell_area()
  • perimeter = area_def.get_cell_perimeter() # swath_def.get_cell_perimeter()
  • res_x, res_y = get_cell_resolutions() # swath_def.get_cell_resolutions()

For pyresample structured grids, the corner and bounds would be guessed assuming equal spacing on either side of the coordinate labels. Then the question would be if to provide the option to infer the spacing in

  • lat/lon projection
  • the AreaDef projection
  • In geocentric x,y,z coordinates (which would be the default solution I guess)

What do you think?

ghiggi avatar Jul 06 '22 12:07 ghiggi

The idea of get_lonlats_corners() and get_lonlats_bounds() is good. I tried to dig deeper how xgcm calculates the boundary for 2D coordinates ... And, I found C-grid but then lost in the codes .... cc @jbusecke

zxdawn avatar Jul 06 '22 12:07 zxdawn

@ghiggi When you say "For pyresample structured grids", do you mean an AreaDefinition? Then I think any spacing decisions should be in the area definitions projection space. For SwathDefinitions I think you have to do it in geocentric.

djhoese avatar Jul 06 '22 13:07 djhoese

@djhoese ... it was my natural mental separation from resampling approaches related to spherical unstructured grids. In such a case the (node) coordinate is 1D, and the simplest way to retrieve "a mesh" is to use SphericalVoronoi. But depending on the knowledge of the background spherical sampling, the mesh can be inferred for the Healpix, (Reduced) Gaussian Legendre, Icosahedral, Octahedral and Equiangular samplings. I think it goes outside the pyresample scope. If one day you want to include also the unstructured grids, I have some code for this scope.

I can draft the methods described above in the coming days :D

ghiggi avatar Jul 06 '22 16:07 ghiggi

Sure. Yeah. I understood that. :confused: <whoosh emoji>

djhoese avatar Jul 06 '22 17:07 djhoese

The idea of get_lonlats_corners() and get_lonlats_bounds() is good. I tried to dig deeper how xgcm calculates the boundary for 2D coordinates ... And, I found C-grid but then lost in the codes .... cc @jbusecke

I have just mentioned that in https://github.com/xarray-contrib/cf-xarray/issues/71 (which might have some interesting overlap with this disussion? Even though I am not sure CF conventions are general enough for the problems described here), but I think the ability to infer grid cell bounds/corners(nodes) should really never have been in xgcm (blame a green OSS contributor named Julius for not realizing that back then...).

Xgcm should consume fully defined finite grids, not produce/infer them. As I mentioned over in the above issue, I am planning on removing the 'autogenerate' functionality from xgcm, but a general replacement would be very beneficial it seems for the broader community.

jbusecke avatar Jul 07 '22 16:07 jbusecke