miscalculation in `Interpolator._extrapolate_rows`
Hello!
Somehow I was recreating something similar to your modis5kmto1km function, when I realized we were building the same thing I thought let's compare and noticed some differences, so I deconstructed the code behind to understand the differences and I believe I found a major issue into it:
In the modis5kmto1km function I noticed at line 83 (on main) what I believe is the root of the issue:
satint.fill_borders("y", "x") => here
Somehow that line increases the dimensions of your self.tie_data entries containing the x, y, z coordinnates.
A typical 406 x 271 modis latitude 2d array becomes a 812 x 273. Going from 271 to 273 make sense to allow the interpolation of border but I can't get how going from 406 to 812 can make sense.
Especially considering the fact that your self.row_indices (which reflects the increment of the axis been used later in the RectBivariateSpline function), in cahnging in such a way as if it meant one pixel was added to the top border while 405 pixels are added at the bootom of the image.
The culprit to me is the _extrapolate_rows function from your Interpolator class which receives two rows as entry and systematically returns 4 while looping over the rows.
I hope I'm wrong, please correct me if so.
Note, the RectBivariateSpline function support the bbox parameter that would strongly simplify your code by allowing the extrapolation natively. https://docs.scipy.org/doc/scipy-1.15.0/reference/generated/scipy.interpolate.RectBivariateSpline.html
I've made more analysis on this, more info pointing toward an issue in the library, you would expect the center of the the 5 km pixel to stay at the same location after interpolation and I'm afraid I'm seeing quite some movement
while using the extrapolation from RectBivariateSpline with the bbox parameter and getting rid off satint.fill_borders("y", "x") is giving me
@Berhinj thanks a lot for performing this investigation! this is indeed a bit worrying.
I’m all for simplifying the code, because at the time we wrote this I don’t think the RectBivariateSpline was supporting extrapolation.
Also looking at the code (in satpy) that relies on interpolating modis navigation data, I can see that we use two functions depending if we have access to the satellite zenith angles or not:
- with zenith angles: https://github.com/pytroll/python-geotiepoints/blob/45ee914164cb3acd6605bf9d7a21c5f580abd691/geotiepoints/modisinterpolator.py
- without: https://github.com/pytroll/python-geotiepoints/blob/45ee914164cb3acd6605bf9d7a21c5f580abd691/geotiepoints/simple_modis_interpolator.py
So that makes three (!!!) ways of interpolating modis navigation data.
So back to the function you are using, I think we need to make some serious analysis, find the best alternative, and deprecate what is not performing well. Or use your fix if that function for interpolating still makes sense.
Do you have the possibility to compare with eg https://github.com/pytroll/python-geotiepoints/blob/45ee914164cb3acd6605bf9d7a21c5f580abd691/geotiepoints/modisinterpolator.py#L49 and see what the results are?
Also could you detail a bit more what data you are using for testing? eg could terrain correction be an issue?
@mraspaud
I'm getting the data through earthaccess nasa package 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/MOD11_L2.061/MOD11_L2.A2023177.1105.061.2023178090735/MOD11_L2.A2023177.1105.061.2023178090735.hdf'
here is an example of how I was loading the data and running the modis function
import rioxarray as rxr
# you'll need to run this for hdf4 support conda install -c conda-forge libgdal-hdf4
modis_hdf_subpath = f"HDF4_EOS:EOS_SWATH:{modis_path}:MOD_Swath_LST:"
modis_lst_subpath = modis_hdf_subpath + "LST"
modis_lat_subpath = modis_hdf_subpath + "Latitude"
modis_lon_subpath = modis_hdf_subpath + "Longitude"
# Get lst data
with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=NotGeoreferencedWarning)
lst_da = rxr.open_rasterio(modis_lst_subpath).squeeze("band")
valid_min, valid_max = map(int, lst_da.attrs["valid_range"].split(','))
lst_da = lst_da.rio.update_attrs({
"scale_factor": lst_da.attrs["_Scale"],
"add_offset": lst_da.attrs["_Offset"],
"valid_min": valid_min,
"valid_max": valid_max,
})
# Masking & Scaling
lst_da = lst_da.where((lst_da >= lst_da.attrs["valid_min"]) & (lst_da <= lst_da.attrs["valid_max"]))
lst_da.data = (lst_da.data + lst_da.attrs["add_offset"]) * lst_da.attrs["scale_factor"]
lst_da = lst_da.rio.write_nodata(np.nan)
# Read latitudes and longitudes
latitudes_da = rxr.open_rasterio(modis_lat_subpath).squeeze("band")
longitudes_da = rxr.open_rasterio(modis_lon_subpath).squeeze("band")
# Fixing the indexes
lst_da["x"] = np.arange(lst_da.sizes["x"], dtype=np.uint16)
lst_da["y"] = np.arange(lst_da.sizes["y"], dtype=np.uint16)
modis5kmto1km(longitudes_da.values, latitudes_da.values)
I will try the other modis functions
edit1: I tried from geotiepoints.simple_modis_interpolator import interpolate_geolocation_cartesian, values I'm getting are very off, but it doesn't sound desinged for 5 to 1km interpolation - fair
One note in favour of your existing code, somehow with my logic I'm getting these diagonal artifacts that you don't have
they get solved once modis5kmto1km is calling the satint.fill_borders("y", "x") function. Our results were the same until there.
according to modis user guide, stripping artefact are expect but I'm lacking info on the topic (page 18)
https://modis-land.gsfc.nasa.gov/pdf/MOD21_LST&E_user_guide_C6_gch_10252017.pdf
@Berhinj I know it has been some time, but any idea where we left off with this. I don't even recall this issue being created (sorry), but your last comment makes it sound like (some of) geotiepoints modis interpolation methods work as expected or at least the output is acceptable. Please correct me if I'm wrong. If we can close this, let's close it. If results are still worrying then we should maybe still dive into it.