python-distance-rasters

Assess alternative implementations for calculating distance

Open sgoodm opened this issue 4 years ago • 2 comments

The package currently uses SciPy's cKDTree along with a Haversine calculation for accurate distance metrics. As of SciPy 1.6.0, cKDTree is functionally identical to KDTree (see #12), but KDTree was previously slower.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html
https://en.wikipedia.org/wiki/Haversine_formula
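For reference, the cKDTree plus Haversine combination can be sketched roughly as follows. This is a minimal illustration and not necessarily how the package implements it: the function names, the choice to build the tree in row/column index space, and the mean Earth radius constant are all assumptions made for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

# Mean Earth radius in km; an assumption chosen for this sketch.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def nearest_feature_distance(mask, lons, lats):
    """Hypothetical helper: Haversine distance from every cell to its
    nearest feature cell.

    mask: 2D bool array, True where a feature cell exists.
    lons: 1D array of cell-centre longitudes, one per column.
    lats: 1D array of cell-centre latitudes, one per row.
    """
    # Build the tree from the (row, col) indices of feature cells.
    rows, cols = np.nonzero(mask)
    tree = cKDTree(np.column_stack([rows, cols]))

    # Query every cell for the index of its nearest feature cell.
    all_rows, all_cols = np.indices(mask.shape)
    all_rows, all_cols = all_rows.ravel(), all_cols.ravel()
    _, nearest = tree.query(np.column_stack([all_rows, all_cols]))

    # Convert the matched index pairs to coordinates and apply Haversine.
    dist = haversine_km(lons[all_cols], lats[all_rows],
                        lons[cols[nearest]], lats[rows[nearest]])
    return dist.reshape(mask.shape)
```

Note that the nearest neighbour here is chosen in index space, so it is the nearest cell by Euclidean pixel distance; only the final distance value is computed with Haversine.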

scikit-learn's BallTree was tested initially and was slower. http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.BallTree.html

Another SciPy approach is distance_transform_edt. It is unclear whether it could return the indexes of the nearest cells, which would be needed to calculate Haversine distance rather than only Euclidean distance. https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.distance_transform_edt.html
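On the index question, the SciPy documentation for distance_transform_edt lists a `return_indices` option, so something like the sketch below may be a starting point. The helper name is hypothetical, and whether the result is accurate enough for this package is untested here: the nearest cell by Euclidean pixel distance is not guaranteed to be the nearest cell by Haversine distance.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def nearest_feature_indices(mask):
    """Hypothetical helper: for each cell, return the Euclidean distance
    (in cell units) to the nearest feature cell and that cell's indices.

    mask: 2D array, truthy where a feature cell exists.
    """
    mask = np.asarray(mask, dtype=bool)
    # The transform measures distance to the nearest zero element,
    # so feature cells must be the zeros: invert the mask.
    dist, idx = distance_transform_edt(~mask, return_indices=True)
    # idx has shape (2, rows, cols): idx[0] holds row indices and
    # idx[1] holds column indices of the nearest feature cell.
    return dist, idx[0], idx[1]
```

The returned row and column arrays could then be mapped to coordinates and passed to a Haversine function in place of the Euclidean distances.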

Finally, while gdal_proximity was determined to be insufficient and drove the need for this package, it would be worth creating some comparable tests to see whether GDAL's implementation could help improve this package in some way. https://gdal.org/programs/gdal_proximity.html https://github.com/OSGeo/gdal/blob/fec15b146f8a750c23c5e765cac12ed5fc9c2b85/gdal/alg/gdalproximity.cpp

sgoodm, Aug 05 '21 21:08

With the implementation of a class to run the distance raster instead of a function (see #5), I have left open the option to swap out the function used to generate the distance array. This should make future comparisons, or implementing multiple methods, much easier.
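To illustrate the idea of a swappable distance function, here is a rough sketch. The class and function names are hypothetical and do not reflect the package's actual API, and the brute-force search is only there to keep the example short.

```python
import numpy as np


def euclidean_cells(cell, feature):
    """Distance in cell units between two (row, col) index pairs."""
    return float(np.hypot(cell[0] - feature[0], cell[1] - feature[1]))


class DistanceRasterSketch:
    """Hypothetical stand-in, not the package's actual class."""

    def __init__(self, mask, distance_func=euclidean_cells):
        self.mask = np.asarray(mask, dtype=bool)
        # Any callable taking two (row, col) pairs can be plugged in.
        self.distance_func = distance_func

    def compute(self):
        # Brute force: compare every cell against every feature cell.
        # Assumes the mask contains at least one feature cell.
        features = np.argwhere(self.mask)
        out = np.empty(self.mask.shape, dtype=float)
        for cell in np.ndindex(self.mask.shape):
            out[cell] = min(self.distance_func(cell, f) for f in features)
        return out
```

Comparing methods then only requires passing a different callable, for example `DistanceRasterSketch(mask, distance_func=my_metric).compute()`, where `my_metric` is whatever alternative is being assessed.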

sgoodm, Aug 06 '21 15:08

In tests, geopy's great-circle distance method was consistently slightly slower, and its geodesic distance method was significantly slower.

sgoodm, Sep 13 '21 22:09