geoarrow-rs

Explode and overline

Open Robinlovelace opened this issue 1 year ago • 5 comments

Starting with the high-level ask: we have a function that does the following.

import geopandas as gpd
from shapely.geometry import LineString

sl = gpd.GeoDataFrame(
    {
        "id": ["a", "b"],
        "value": [1, 2]
    },
    geometry = [
        LineString([(0, 0), (1, 1), (2, 2)]),
        LineString([(0, 1), (1, 1), (2, 2)])
    ]
)

# plot with values:
sl.plot(column = "value")
# The output should be along the lines of the following, where the shared
# segment (1, 1)-(2, 2) carries the combined value 1 + 2 = 3:
sl_overline = gpd.GeoDataFrame(
    {
        "value": [1, 2, 3]
    },
    geometry = [
        LineString([(0, 0), (1, 1)]),
        LineString([(0, 1), (1, 1)]),
        LineString([(1, 1), (2, 2)])
    ]
)
sl_overline.plot(column = "value")

That may seem simple but there are several steps:

  • [x] exploding the geometries, as implemented in the native QGIS algorithm and proposed in a GeoPandas feature request: https://github.com/geopandas/geopandas/issues/2476
  • [ ] re-ordering linestring coordinates so that the same line stored with a different coordinate order is not treated as a separate line
  • [ ] aggregating the values (I suggest that this is done as a separate step, possibly an optional one, to give users control over aggregating functions and variables, ideally with the power of polars)
  • [ ] merging the exploded linestrings into longer linestrings that are as long as possible without the aggregated values changing
  • [ ] returning joined-up geometries
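
A minimal sketch of the first two steps above (exploding and re-ordering), assuming plain coordinate tuples rather than GeoPandas or geoarrow-rs objects so that it runs without dependencies. The helper name `explode_and_order` is hypothetical, and sorting each segment's two endpoints is only one possible way to give reversed duplicates a common key:

```python
# Input rows mirror the example above: (id, value, coordinates).
lines = [
    ("a", 1, [(0, 0), (1, 1), (2, 2)]),
    ("b", 2, [(0, 1), (1, 1), (2, 2)]),
]


def explode_and_order(rows):
    """Split each line into two-point segments with a canonical point order.

    Hypothetical helper for illustration only.
    """
    exploded = []
    for row_id, value, coords in rows:
        # Pair each coordinate with its successor to get the segments.
        for start, end in zip(coords, coords[1:]):
            # Put the lexicographically smaller endpoint first so that
            # A->B and B->A end up as the same key.
            segment = (start, end) if start <= end else (end, start)
            exploded.append((segment, row_id, value))
    return exploded


exploded = explode_and_order(lines)

# A line digitised in the opposite direction yields the same segment key.
reversed_line = explode_and_order([("c", 5, [(2, 2), (1, 1)])])
```

With this data the segment from (1, 1) to (2, 2) appears twice in `exploded`, once for each input line, which is what the later aggregation step relies on.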

References

  • There is a long-standing and well-used R implementation, a breakdown of which can be found here (source of the Python reproducible example above): https://github.com/Robinlovelace/overline-tests
  • There is a paused attempt at a pure Rust implementation: https://github.com/acteng/overline/tree/master
  • Here's an example of the outputs, detailed route networks with attribute values (in this case representing cycling potential, zoom in to see): https://dev.cruse.bike/ or https://npt.scot

Robinlovelace avatar Jan 24 '24 01:01 Robinlovelace

Update on my thinking: I don't think geoarrow-rs needs to do the whole thing, because there are many options in the summarise -> aggregation step that are worth exposing to the user. Just getting the exploded and ordered linestrings alongside their attributes would be enough, I think.
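
To illustrate why leaving aggregation to the user is attractive, here is a rough sketch that takes already exploded and ordered segments with their values and applies whichever aggregating function the caller passes in. The `aggregate` helper and the tuple-based data layout are assumptions for illustration, not geoarrow-rs API; a dataframe library's group-by would play the same role in practice.

```python
from collections import defaultdict
from statistics import mean

# Exploded, ordered segments paired with the value of their source line.
exploded = [
    (((0, 0), (1, 1)), 1),
    (((1, 1), (2, 2)), 1),
    (((0, 1), (1, 1)), 2),
    (((1, 1), (2, 2)), 2),
]


def aggregate(rows, func):
    """Group values by segment and reduce each group with func.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for segment, value in rows:
        groups[segment].append(value)
    return {segment: func(values) for segment, values in groups.items()}


# The caller chooses the aggregation; the exploded data stays the same.
summed = aggregate(exploded, sum)
largest = aggregate(exploded, max)
average = aggregate(exploded, mean)
```

Here `summed` gives the shared segment the value 3, matching the expected output in the original example, while `largest` and `average` show how other choices fall out of the same exploded data.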

Merging the linestrings back together is another step that would massively benefit from being done here.
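
As a rough sketch of what that merging step could look like, the following joins two-point segments into longer lines wherever exactly two segments with the same value meet at a node. It assumes segments are stored as a dict keyed by coordinate pairs; `merge_segments` is a hypothetical name, and this is not how the R implementation or geoarrow-rs does it.

```python
from collections import defaultdict


def merge_segments(segments):
    """Merge segments into lines that are as long as possible.

    segments maps (start, end) coordinate pairs to a value.
    Returns a list of (coordinates, value) tuples.
    Hypothetical helper for illustration only.
    """
    # Index the segments by the nodes they touch.
    by_node = defaultdict(list)
    for seg in segments:
        for node in seg:
            by_node[node].append(seg)

    def joinable(node, value):
        # Merge through a node only if exactly two segments meet there
        # and both carry the same value; junctions and value changes stop it.
        touching = by_node[node]
        return len(touching) == 2 and all(segments[s] == value for s in touching)

    used = set()
    merged = []
    for seg in segments:
        if seg in used:
            continue
        used.add(seg)
        value = segments[seg]
        line = list(seg)
        # Grow the line from its end first, then from its start.
        for at_end in (True, False):
            while True:
                node = line[-1] if at_end else line[0]
                if not joinable(node, value):
                    break
                nxt = next((s for s in by_node[node] if s not in used), None)
                if nxt is None:
                    break  # Closed loop: every segment is already used.
                used.add(nxt)
                other = nxt[1] if nxt[0] == node else nxt[0]
                if at_end:
                    line.append(other)
                else:
                    line.insert(0, other)
        merged.append((line, value))
    return merged


# Two consecutive segments share value 1; the third has value 2.
network = {
    ((0, 0), (1, 0)): 1,
    ((1, 0), (2, 0)): 1,
    ((2, 0), (3, 0)): 2,
}
merged = merge_segments(network)
```

In this example the first two segments are joined into one three-point line, and the third stays separate because its value differs.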

Robinlovelace avatar Jan 24 '24 09:01 Robinlovelace

I don't think geoarrow-rs needs to do the whole thing

Yeah, this was going to be a point of discussion. The goal for now is to implement operations that are as general as possible. So explode makes sense because it's very general. In terms of "explode segments", I'm not sure what the most general API is. We could have an explode_segments method on LineStringArray and ChunkedLineStringArray that returns a LineStringArray of length-2 lines, as well as indices to pass into a take operation. So something roughly like:

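# Rough sketch only: explode_segments is the method proposed above.
table = GeoTable(...)
geometry = table.geometry
assert isinstance(geometry, ChunkedLineStringArray)
# Returns the length-2 lines plus, for each one, the index of its source row.
exploded_geometry, indices = geometry.explode_segments()
# Repeat the attribute rows with a take on those indices, then re-attach the geometry.
exploded_table = table.remove_geometry()[indices].add_column(exploded_geometry)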
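
The segments-plus-indices idea can be illustrated in plain Python. This is only a sketch under assumptions: `explode_segments` and `take` below are stand-in functions operating on lists, not the geoarrow-rs methods, which would work on Arrow arrays.

```python
def explode_segments(lines):
    """Return two-point segments and the index of each segment's source line.

    Plain-Python stand-in for illustration only.
    """
    segments, indices = [], []
    for row, coords in enumerate(lines):
        for start, end in zip(coords, coords[1:]):
            segments.append((start, end))
            indices.append(row)
    return segments, indices


def take(column, indices):
    """Select elements of column by position, repeating rows as needed."""
    return [column[i] for i in indices]


geometry = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 1), (1, 1), (2, 2)],
]
ids = ["a", "b"]
values = [1, 2]

segments, indices = explode_segments(geometry)
# Attribute columns are expanded to line up with the exploded geometry.
exploded_ids = take(ids, indices)
exploded_values = take(values, indices)
```

Returning indices rather than copying attributes keeps the geometry operation general: the caller decides which attribute columns to carry along.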

The other note is that the goal with geoarrow is to make sharing geometries across libraries zero-cost, because the geometry format is ABI-stable. So one possibility is to implement some core operations in the geoarrow.rust.core Python package and, for operations with a narrower use case, have another Python package like geoarrow.road_networks. Each package can then share data totally transparently and at zero cost.

kylebarron avatar Jan 24 '24 15:01 kylebarron

General :+1: to the sentiment here, without commenting on details.

Robinlovelace avatar Jan 24 '24 15:01 Robinlovelace

Heads-up @wangzhao0217, who may be interested in taking a look and may have questions. I would love to take this forward and provide input on some of these tasks.

Robinlovelace avatar Mar 07 '24 11:03 Robinlovelace

Awesome! Let me know if I can help by providing pointers or anything.

kylebarron avatar Mar 07 '24 17:03 kylebarron