
Vectorized geodesic area / length calculations

Open • brendan-ward opened this issue 8 months ago • 1 comment

As mentioned in GeoPandas #3539, there is interest in using pyproj to calculate geodesic areas and lengths. The current pyproj API for this (Geod.geometry_area_perimeter) takes a single shapely geometry per call, which is less than ideal for (potentially large) arrays of geometries. Given the push for vectorized APIs in Shapely 2.0 and their adoption in GeoPandas, it seems like there would be a considerable performance benefit to providing a vectorized API in pyproj for calculating geodesic area and length. These would need to be implemented in the Cython wrapper on top of GeographicLib (I have a starting point for geodesic area I'd be happy to share).

However, it looks like numpy is only a test dependency here rather than a hard dependency, and I think we'd need to rely on numpy (in the Cython wrapper) to vectorize this properly (i.e., accept an array of shapely geometries). I suspect that the decision about integrating numpy is the more substantive part of this idea.

Is there interest in providing vectorized capabilities for geodesic area / length? What are your thoughts about integrating numpy for this?

brendan-ward, Mar 31 '25 19:03

There has been interest in vectorizing geodesic calculations for geopandas & cartopy. The speed benefits would be impressive. Not opposed to looking into potential integration opportunities there.

pyproj has almost zero dependencies and it would be ideal to keep it that way. It keeps maintenance & installation much simpler. However, that doesn't mean it cannot support other libraries. For example, pyproj supports numpy without depending on numpy. It does this through Python's buffer C API.
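As a rough illustration of the buffer idea (not pyproj's actual code, which works at the Cython level), the sketch below uses Python's built-in memoryview to modify, in place, any object that exposes the buffer protocol. The function name degrees_to_radians_in_place is hypothetical and exists only for this example. No numpy import is involved, yet a one-dimensional float64 numpy array could be passed in the same way as the array.array used here.

```python
import math
from array import array


def degrees_to_radians_in_place(values):
    """Convert a writable 1-D buffer of C doubles from degrees to radians.

    Hypothetical helper for illustration only. ``values`` may be any object
    that supports the buffer protocol, such as ``array.array('d')``.
    """
    view = memoryview(values)  # raises TypeError if there is no buffer support
    if view.ndim != 1 or view.format != "d":
        raise TypeError("expected a one-dimensional buffer of C doubles")
    if view.readonly:
        raise ValueError("buffer must be writable")
    # Writing through the view changes the caller's memory directly.
    for i in range(view.shape[0]):
        view[i] = math.radians(view[i])
    return values


lons = array("d", [0.0, 90.0, 180.0])
degrees_to_radians_in_place(lons)
```

The caller's container is updated without being copied, and the function never needs to know which library produced the buffer.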

Here are locations in the code that would be helpful for reference:

If for some reason numpy cannot be supported as an optional runtime dependency for the vectorized version, an alternative to consider could be creating a tightly coupled library and exposing that library as an optional dependency through pyproj.
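For context, one common way to treat a library as an optional runtime dependency is to import it lazily, only when the feature that needs it is called. The sketch below shows that general pattern using only the standard library. The helper names require_optional and vectorized_feature_enabled are hypothetical and are not part of pyproj.

```python
import importlib


def require_optional(module_name, feature):
    """Import ``module_name`` on demand, or raise an error naming ``feature``.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional dependency {module_name!r}; "
            "install it to enable this feature"
        ) from exc


def vectorized_feature_enabled():
    """Report whether a (hypothetical) numpy-backed code path could be used."""
    try:
        require_optional("numpy", "vectorized geodesic area")
    except ImportError:
        return False
    return True
```

With this approach the base install stays free of the extra dependency, and users who call the vectorized feature without it get an error that names the missing package.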

snowman2, Mar 31 '25 20:03