harmonica icon indicating copy to clipboard operation
harmonica copied to clipboard

Allow NaN's in input grids for FFT filters

Open mdtanker opened this issue 1 year ago • 19 comments

Description of the desired feature:

As pointed out by @RichardScottOZ in #377, it would be good to be able to apply the FFT transformation on grids that contain NaN's. I thought I'd share how I'm currently doing this, and start a discussion about if / how we could include it in the filters.

I'm filling the grid with either constant values (should be the median of the grid values to avoid edge effects), or with a nearest neighbor interpolation. Here's a few options:

xarray.DataArray.fillna()

 filled = grid.fillna(10)   # constant value
 filled = grid.fillna(np.nanmedian(grid))   # median of grids value

pygmt.grdfill()

 filled = pygmt.grdfill(grid, mode="n", verbose="q")   # pygmt nearest neighbor

rioxarray.interpolate_na()

filled = grid.rio.write_crs("epsg:4326"  # rio needs crs set
     ).rio.set_spatial_dims(grid.dims[1], grid.dims[0]   # rio needs dimension names set
     ).rio.interpolate_na(method="nearest")   

verde.KNeighbors()

df = vd.grid_to_table(grid)
df_dropped = df[df[grid.name].notna()]
coords = (df_dropped[grid.dims[1]], df_dropped[grid.dims[0]])
region = vd.get_region((df[grid.dims[1]], df[grid.dims[0]]))
filled = vd.KNeighbors(
            ).fit(coords, df_dropped[grid.name]
            ).grid(region=region, shape=grid.shape).scalars

Are there any other interpolation techniques I'm missing here?

This filled grid can be padded, passed to the FFT filters, unpadded, and then masked by the original grid. I'm using xr.where for this:

result = xr.where(grid.notnull(), unpadded_grid, grid)

Here is a function I use to combine all of this:


def filter_grid(
    grid,
    filter_width,
    filt_type = "lowpass",
):
    # get coordinate names
    original_dims = grid.dims

    # if there are nan's, fill them with nearest neighbor
    if grid.isnull().any():
        filled = grid.rio.write_crs("epsg:4326"
             ).rio.set_spatial_dims(original_dims[1], original_dims[0]
             ).rio.interpolate_na(method="nearest")
        print("filling NaN's with nearest neighbor")
    else:
        filled = grid.copy()

    # reset coord names back to originals
    filled = filled.rename({
        filled.dims[0]:original_dims[0],
        filled.dims[1]:original_dims[1],
        })

    # define width of padding in each direction
    pad_width = {
        original_dims[1]: grid[original_dims[1]].size // 3,
        original_dims[0]: grid[original_dims[0]].size // 3,
    }

    # apply padding
    padded = xrft.pad(filled, pad_width)

    if filt_type == "lowpass":
        filt = hm.gaussian_lowpass(
            padded,
            wavelength = filter_width).rename("filt")
    elif filt_type == "highpass":
        filt = hm.gaussian_highpass(
            padded,
            wavelength = filter_width).rename("filt")
    else:
        raise ValueError("filt_type must be 'lowpass' or 'highpass'")

    # unpad the grid
    unpadded = xrft.unpad(filt,  pad_width)

    # reset coordinate values to original (avoid rounding errors)
    unpadded = unpadded.assign_coords(
        {
            original_dims[0]: grid[original_dims[0]].values,
            original_dims[1]: grid[original_dims[1]].values,
        })

    if grid.isnull().any():
        result = xr.where(grid.notnull(), unpadded, grid)
    else:
        result = unpadded.copy()

    np.testing.assert_equal(grid.easting.values, result.easting.values)
    np.testing.assert_equal(grid.northing.values, result.northing.values)

    return result

If we wanted to add this option to the transformations, I was thinking we could add the above filling and masking to the apply_filter() function, with an additional parameter: fill_value=None.

fill_value options would include:

  • None (default, no filling, ValueError: Found nan(s) ... raised)
  • Float (fill with constant value)
  • Callable (np.nanmedian or equivalent, to fill the grid with)
  • String: ("nearest"): use either Verde, PyGMT or Rio nearest neighbor interpolation. * pygmt or rio would require an additional dependency * if rio, would require optional CRS kwarg(default to EPSG:4326?)

fill_value would then be added to each filter function as well.

What are everyone's thoughts on this? Related to #390 as well.

Are you willing to help implement and maintain this feature?

mdtanker avatar Mar 15 '23 22:03 mdtanker