xdem
xdem copied to clipboard
`xdem.coreg.ICP.fit_pts` fails for unknown reason with a regular dataset
As promised in #422 , here's the other issue hehe.
This is a strange one. I already have a fix, but I don't understand why it helps!
Problem description
When running ICP.fit_pts()
I ran into the issue that led to #422; my points were all filtered away for some reason. After some debugging, I found where this happens, but I still don't know why.
First, rasters containing information on the normals are sampled using regular Raster.interp_points
calls:
https://github.com/GlacioHack/xdem/blob/3956073ef6c1846ffc6de3ac5368269f45131f42/xdem/coreg/affine.py#L525-L529
Then, when the rasters have been sampled, a convoluted nan removal line removes the unsuccessful samples: https://github.com/GlacioHack/xdem/blob/3956073ef6c1846ffc6de3ac5368269f45131f42/xdem/coreg/affine.py#L536-L537
This led to my ref_dem being empty, as the above interp_points
for some reason only returned nans! I figured out a fix for this problem, but I still don't understand the symptom...
The solution
Instead of the sampling in the code (shown above), I changed it to:
normal_east.tags["AREA_OR_POINT"] = "Area"
rowcol = np.round(np.dstack(normal_east.xy2ij(ref_dem["E"], ref_dem["N"], shift_area_or_point=True)).squeeze()).astype(int)
if any(col not in ref_dem for col in ["nx", "ny", "nz"]):
for key, raster in [("nx", normal_east), ("ny", normal_north), ("nz", normal_up)]:
ref_dem[key] = raster.data[rowcol[:, 0], rowcol[:, 1]].filled(np.nan)
Basically, I make a more efficient nearest neighbour sampling by first determining the row and column indices, then directly indexing the underlying raster. This should be much more efficient, as that part is only done once instead of three times, and for some reason it fixes my problem with the sampling.
What's the actual problem?
Without a minimal (non-)working example, it's hard for me to figure out exactly what's wrong... Honestly, I think it boils down to one of two potential issues:
- There's something wrong with
Raster.interp_points
in geoutils - There's something wrong with my use of the functionality
Honestly, and argument for making a PR to "fix" this can simply be motivated by performance. If Raster.interp_points
is not used in the method, no Raster
s have to be instantiated and I can just use the rasterio coords to ij functions. That'd reduce the amount of copying that might happen and it will probably make the code look cleaner. I might open a PR to improve the code and then add an associated geoutils issue for tackling the underlying problem.
Still, it would be great to get to the bottom of why this actually fails... @rhugonnet or @adehecq, have you experienced something similar before with this function?
Also, @rhugonnet and @adehecq, if you have any recommendations of how to make a minimal (non-)working example, I'm all ears. It's a float32 DEM with ICESat-2 points that should work just like in the test. But it doesn't!
Wait what the heck is the difference between Raster.value_at_coords
and Raster.interp_points
?? I see that they have almost the same docstring and I've invariably used both interchangeably.
I have a deadline today, will look in more details tomorrow.
Quick answer on your last question:
- The
value_at_coords
function extracts the closest point value to the coords or its average (any type of reductor function) within a square window of a certain size (doesn't require to load the entire Raster). - The
interp_points
function performs a regular grid interpolation (nearest, linear, cubic) to extract the values at the coords.
More details in the examples: https://geoutils.readthedocs.io/en/stable/analysis_examples/index.html#point-extraction But, yes, we definitely need to change the names and improve the docstrings of those functions...
I have a deadline today, will look in more details tomorrow.
Quick answer on your last question:
* The `value_at_coords` function extracts the closest point value to the coords or its average (any type of reductor function) within a square window of a certain size (doesn't require to load the entire Raster). * The `interp_points` function performs a regular grid interpolation (nearest, linear, cubic) to extract the values at the coords.
More details in the examples: https://geoutils.readthedocs.io/en/stable/analysis_examples/index.html#point-extraction But, yes, we definitely need to change the names and improve the docstrings of those functions...
Thanks for the info! That makes more sense now. But perhaps they could just be one function with an "interpolate" kwarg to value_at_coords
? Either way, a docstring improvement would be great! I'll add an issue in geoutils.