
Forced `interpolation="None"` in `plot()` to avoid NaN artefacts causes large vectorized file size

Open rhugonnet opened this issue 11 months ago • 2 comments

Also, because we override the default, it can no longer be configured through matplotlib's rcParams; the only way is to pass the interpolation= parameter at every call of plot(), which is impractical.

To give an idea of the impact: plotting a 100MB DEM with our forced interpolation="None" yields a PDF figure file of around 80MB, no matter the figsize or dpi arguments.

rhugonnet avatar Dec 12 '24 00:12 rhugonnet
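
For illustration, here is a minimal sketch of how a plotting wrapper can leave the choice to matplotlib's rcParams while still accepting a per-call override. `plot_raster` is a hypothetical stand-in, not the actual geoutils `plot()` implementation; it only relies on the documented `imshow` behaviour that `interpolation=None` (the Python object, not the string) falls back to `rcParams["image.interpolation"]`.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def plot_raster(data, ax=None, interpolation=None, **kwargs):
    """Hypothetical stand-in for plot(): no interpolation is forced."""
    if ax is None:
        ax = plt.gca()
    # interpolation=None lets imshow use rcParams["image.interpolation"],
    # so a global setting keeps working; any string overrides it per call.
    return ax.imshow(data, interpolation=interpolation, **kwargs)


data = np.random.default_rng(0).random((50, 50))
fig, ax = plt.subplots()

# Global control through rcParams
with plt.rc_context({"image.interpolation": "nearest"}):
    im_global = plot_raster(data, ax=ax)

# Per-call control through the argument
im_local = plot_raster(data, ax=ax, interpolation="none")

resolved = (im_global.get_interpolation(), im_local.get_interpolation())
plt.close(fig)
```

With this pattern the default is whatever the user configured, and an explicit argument still wins for a single call.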

I remember now that we forced "None" to avoid matplotlib propagating NaNs into big areas... It looks extremely bad on rasters with small NaN gaps distributed everywhere. We could almost think about running our own SciPy interpolation by default to downsample the raster to a reasonable size for a plot (as an additional argument to plot), then set interpolation="None" via geoutils.config, which can be overridden globally? This way there would be full argument/global control for the user, and still a good default plot() that saves files of a reasonable size.

rhugonnet avatar Dec 12 '24 01:12 rhugonnet
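
As a rough sketch of the downsampling idea, the snippet below reduces a raster by an integer factor while ignoring NaNs, so that small gaps do not spread. Note the assumptions: it uses a NumPy block `nanmean` rather than a SciPy interpolation, and `downsample_nanmean` and its `factor` argument are hypothetical names, not part of geoutils.

```python
import warnings

import numpy as np


def downsample_nanmean(arr, factor):
    """Block-average a 2D array by an integer factor, ignoring NaNs.

    Hypothetical helper: trailing rows/columns that do not fill a whole
    block are dropped. A block containing only NaNs stays NaN.
    """
    rows = (arr.shape[0] // factor) * factor
    cols = (arr.shape[1] // factor) * factor
    # Split each axis into (number of blocks, block size)
    blocks = arr[:rows, :cols].reshape(
        rows // factor, factor, cols // factor, factor
    )
    with warnings.catch_warnings():
        # nanmean warns on all-NaN blocks; the NaN result is what we want
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return np.nanmean(blocks, axis=(1, 3))


arr = np.arange(16, dtype=float).reshape(4, 4)
arr[0, 0] = np.nan  # a small gap
small = downsample_nanmean(arr, 2)
```

Here the block containing the gap is averaged over its three valid pixels instead of becoming NaN, which is the behaviour a plot-oriented downsampling would likely need.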

I think before going down the road of downsampling on our side (which might be less transparent/understandable for the users), I would first try different interpolation schemes: https://matplotlib.org/stable/gallery/images_contours_and_fields/interpolation_methods.html. Maybe "nearest" would solve the issue of NaN propagation while reducing image size? But it's indeed still a problem that the parameter is overwritten for the rest of the script...

adehecq avatar Jan 29 '25 15:01 adehecq
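
One way to test this suggestion is to measure the PDF size for each interpolation scheme on a synthetic raster with scattered NaN gaps. This is only a sketch: the array size, NaN fraction, and the helper name `pdf_size_bytes` are arbitrary choices for illustration, and it measures file size only, so the NaN rendering still has to be checked visually.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def pdf_size_bytes(data, interpolation, figsize=(4, 3), dpi=72):
    """Render data with imshow and return the size of the saved PDF."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.imshow(data, interpolation=interpolation)
    buf = io.BytesIO()
    fig.savefig(buf, format="pdf", dpi=dpi)
    plt.close(fig)
    return buf.getbuffer().nbytes


rng = np.random.default_rng(0)
data = rng.random((800, 800))
data[rng.random(data.shape) < 0.01] = np.nan  # small scattered gaps

sizes = {name: pdf_size_bytes(data, name) for name in ("none", "nearest")}
print(sizes)
```

With "none", vector backends embed the image at its native resolution, whereas other schemes are resampled to the figure's output resolution, so the "nearest" file should come out smaller for a raster much larger than the figure.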