harmonica
harmonica copied to clipboard
Allow NaN's in input grids for FFT filters
Description of the desired feature:
As pointed out by @RichardScottOZ in #377, it would be good to be able to apply the FFT transformation on grids that contain NaN's. I thought I'd share how I'm currently doing this, and start a discussion about if / how we could include it in the filters.
I'm filling the grid with either constant values (should be the median of the grid values to avoid edge effects), or with a nearest neighbor interpolation. Here's a few options:
filled = grid.fillna(10) # constant value
filled = grid.fillna(np.nanmedian(grid)) # median of grids value
filled = pygmt.grdfill(grid, mode="n", verbose="q") # pygmt nearest neighbor
filled = grid.rio.write_crs("epsg:4326" # rio needs crs set
).rio.set_spatial_dims(grid.dims[1], grid.dims[0] # rio needs dimension names set
).rio.interpolate_na(method="nearest")
df = vd.grid_to_table(grid)
df_dropped = df[df[grid.name].notna()]
coords = (df_dropped[grid.dims[1]], df_dropped[grid.dims[0]])
region = vd.get_region((df[grid.dims[1]], df[grid.dims[0]]))
filled = vd.KNeighbors(
).fit(coords, df_dropped[grid.name]
).grid(region=region, shape=grid.shape).scalars
Are there any other interpolation techniques I'm missing here?
This filled grid can be padded, passed to the FFT filters, unpadded, and then masked by the original grid. I'm using xr.where for this:
result = xr.where(grid.notnull(), unpadded_grid, grid)
Here is a function I use to combine all of this:
def filter_grid(
grid,
filter_width,
filt_type = "lowpass",
):
# get coordinate names
original_dims = grid.dims
# if there are nan's, fill them with nearest neighbor
if grid.isnull().any():
filled = grid.rio.write_crs("epsg:4326"
).rio.set_spatial_dims(original_dims[1], original_dims[0]
).rio.interpolate_na(method="nearest")
print("filling NaN's with nearest neighbor")
else:
filled = grid.copy()
# reset coord names back to originals
filled = filled.rename({
filled.dims[0]:original_dims[0],
filled.dims[1]:original_dims[1],
})
# define width of padding in each direction
pad_width = {
original_dims[1]: grid[original_dims[1]].size // 3,
original_dims[0]: grid[original_dims[0]].size // 3,
}
# apply padding
padded = xrft.pad(filled, pad_width)
if filt_type == "lowpass":
filt = hm.gaussian_lowpass(
padded,
wavelength = filter_width).rename("filt")
elif filt_type == "highpass":
filt = hm.gaussian_highpass(
padded,
wavelength = filter_width).rename("filt")
else:
raise ValueError("filt_type must be 'lowpass' or 'highpass'")
# unpad the grid
unpadded = xrft.unpad(filt, pad_width)
# reset coordinate values to original (avoid rounding errors)
unpadded = unpadded.assign_coords(
{
original_dims[0]: grid[original_dims[0]].values,
original_dims[1]: grid[original_dims[1]].values,
})
if grid.isnull().any():
result = xr.where(grid.notnull(), unpadded, grid)
else:
result = unpadded.copy()
np.testing.assert_equal(grid.easting.values, result.easting.values)
np.testing.assert_equal(grid.northing.values, result.northing.values)
return result
If we wanted to add this option to the transformations, I was thinking we could add the above filling and masking to the apply_filter() function, with an additional parameter: fill_value=None
.
fill_value
options would include:
- None (default, no filling, ValueError: Found nan(s) ... raised)
- Float (fill with constant value)
- Callable (np.nanmedian or equivalent, to fill the grid with)
- String: ("nearest"): use either Verde, PyGMT or Rio nearest neighbor interpolation. * pygmt or rio would require an additional dependency * if rio, would require optional CRS kwarg(default to EPSG:4326?)
fill_value
would then be added to each filter function as well.
What are everyone's thoughts on this? Related to #390 as well.
Are you willing to help implement and maintain this feature?