polartoolkit icon indicating copy to clipboard operation
polartoolkit copied to clipboard

Optimize `fetch.resample_grid`

Open mdtanker opened this issue 7 months ago • 1 comments

Description of the desired feature:

The function fetch.resample_grid is heavily used in PolarToolkit for changing a grid's region, spacing, or registration (pixel vs. gridline), or all of the above. During fetch calls of very large grids (e.g. BedMap3, Antarctic-wide and 500m resolution), this resampling is slow, and prone to crashing (maybe a separate issue). Here I'll try and lay out the possible options, and our current implementation, to see if there is a better option.

Current (PyGMT)

  • extract grid spacing, region, and registration
  • if only changing spacing
    • if spacing is being reduced, use pygmt.grdsample
    • if spacing is increasing, first filter grid and new spacing with pygmt.grdfilter, then use pygmt.grdsample.
  • if only changing region
    • change region to be nearest multiple of spacing
    • cut grid with pygmt.grdcut
  • if only registration changes
    • manually change registration with GMT grdedit
  • if changing multiple things
    • first cut grid to new region with pygmt.grdcut
    • then change spacing with pygmt.grdsample with pygmt.grdfilterer if reducing spacing
    • set registration during pygmt.grdsample call.

Rioxarray rio.reproject_match

  • updates 1 dataarray/dataset to match to spacing, region, and projection of another
  • how do know which is the grid to change?

Rioxarray

  • use rio.clip_box, rio.reproject(resolution=<<spacing>> in a scheme simial to the current PyGMT version.

Salem

  • lots of useful looking functions and grid attributes
    • check reference, region, and spacing of grid: grid.pixel_ref, grid.extent, grid.dx, grid.dy
    • change resolution (by setting shape not spacing): grid.regrid
    • change region: grid.subset
    • reproject a grid to match another and as variable: transform_and_add
    • reproject a grid: map_gridded_data

xarray-regrid

  • data accessor make code easy to implement
  • need to make grid to set the desired resolution with xarray_regrid.Grid
  • handle both finer and coarser resolutions

xESMF

  • similar to xarray-regrid, need to first create a grid with the desired region/spacing
  • auto-loops over variables of dataset
  • can save the regridder weights as a netcdf for future use! make regridding very fast

Pyresample

  • seems a bit heavy on boilerplate code, but if it's in the backend of PolarToolkit then that is fine.

mdtanker avatar May 16 '25 08:05 mdtanker

Examples of how to use some of these functions:

# make your grid of interest 
import verde as vd
import numpy as np
coords = vd.grid_coordinates(
    region=(-3330000.0, 3330000.0, -3330000.0, 3330000.0),
    spacing=500,
)
grid = vd.make_xarray_grid(
    coords,
    np.ones_like(coords[0]), data_names="z",
    dims =("y", "x"),
).z

# da is your actual data grid

# rioxarray
import rioxarray
da_regrid = da.rio.write_crs("epsg:3031").rio.reproject_match(
    grid.rio.write_crs("epsg:3031"),
).drop_vars("mapping")

# xarray regrid
import xarray_regrid
da_regrid = ice_thickness.regrid.linear(grid)

mdtanker avatar Jun 02 '25 16:06 mdtanker