momepy icon indicating copy to clipboard operation
momepy copied to clipboard

ENH: geometry-based simplification of roundabouts

Open gregmaya opened this issue 3 years ago • 7 comments

I'm sure there might be corrections to be made (including one BUG FIX that I found whilst doing the test!) but I believe submitting is helpful to move forward.

gregmaya avatar Jul 31 '22 08:07 gregmaya

Hey, I have renamed the PR and resolved the merge conflict.

  1. When creating a PR, try to always ensure that it is based on the latest version of main. When it is a simple one, make sure that your local repo is up to date and then make a new branch from main. When it is a larger one that takes time, do the same in the beginning and then merge main to the branch before creating a PR
git fetch upstream
git merge upstream/main
  1. make sure you follow the style guide. As you can see from the linting check, it does not follow black at the moment. Optionally (but recommended), you can setup pre-commit hooks to automatically run black when you make a git commit.
python -m pip install pre-commit

From the root of the momepy repository, you should then install the pre-commit included in GeoPandas:

pre-commit install

Then black will be run automatically each time you commit changes. You can skip these checks with git commit --no-verify.

I'll check the actual contents later ;).

martinfleis avatar Jul 31 '22 09:07 martinfleis

Wow.! Ok definitely PRs are a science in itself! .. Thanks for the explanations... although some still sound foreign to me I'll have your recommendations to start Googling. Many thanks @martinfleis !

gregmaya avatar Jul 31 '22 10:07 gregmaya

sorry I marked the PR as 'ready for review' by mistake! (not sure how to undo 😅)

gregmaya avatar Jul 31 '22 11:07 gregmaya

Hoping the auto-BLACK pre-commit action is actually working. 🧐

gregmaya avatar Aug 10 '22 08:08 gregmaya

Hoping the auto-BLACK pre-commit action is actually working.

It looks it is :).

martinfleis avatar Aug 10 '22 09:08 martinfleis

"touches" is not a valid predicate argument in geopandas v0.8.2, which is the version being pulled in for the 3.8-minimal CI run, resulting in the current failure.

jGaboardi avatar Aug 12 '22 13:08 jGaboardi

I wouldn't bother trying to make it work for 0.8.2 to be fair. There's too many constraints.

martinfleis avatar Aug 12 '22 13:08 martinfleis

@gregmaya give us a shout if you want another round of reviews.

martinfleis avatar Aug 16 '22 19:08 martinfleis

@gregmaya give us a shout if you want another round of reviews.

A review would be appreciated, yes! @martinfleis I really want to move to a selection algorithm (which I've already started some explorations)... but obviously leaving something working here is crucial. For the sake of future improvements on this roundabout_simplification I'm going to list a few ideas to include/solve:

  1. The current selection part of the algorithm is classifying many false positives (i.e. all those roundabouts that are 'cut in half' by streets continuing their original direction.
  2. Include an extra column on the output with a label that highlights the streets that were extended and were 'incoming to a roundabout'.
  3. Create an optional return with the deleted streets as a separate gdf. (points 2 and 3 could be implemented in all the functions that aim at network simplification)
  4. Overall improvements of inefficient coding.

gregmaya avatar Aug 17 '22 07:08 gregmaya

If you pass a dataframe with MultiLinestrings, it fails.

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Untitled-1.ipynb Cell 4 in <cell line: 1>()
----> 1 simplified = momepy.roundabout_simplification(df)

File ~/Git/momepy/momepy/preprocessing.py:1053, in roundabout_simplification(edges, polys, circom_threshold, area_threshold, include_adjacent, center_type, angle_threshold)
   1046 rab = _selecting_rabs_from_poly(
   1047     polys,
   1048     circom_threshold=circom_threshold,
   1049     area_threshold=area_threshold,
   1050     include_adjacent=include_adjacent,
   1051 )
   1052 rab_multipolygons = _rabs_center_points(rab, center_type=center_type)
-> 1053 incoming_all, idx_drop = _selecting_incoming_lines(
   1054     rab_multipolygons, edges, angle_threshold=angle_threshold
   1055 )
   1056 output = _ext_lines_to_center(edges, incoming_all, idx_drop)
   1058 return output

File ~/Git/momepy/momepy/preprocessing.py:915, in _selecting_incoming_lines(rab_multipolygons, edges, angle_threshold)
    912 incoming = touching.loc[ls]
    914 # figuring out which ends of incoming edges needs to be connected to the center_pt
--> 915 incoming["first_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[0]))
    916 incoming["dist_fisrt_pt"] = incoming.center_pt.distance(incoming.first_pt)
    917 incoming["last_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[-1]))

File ~/Git/geopandas/geopandas/geoseries.py:643, in GeoSeries.apply(self, func, convert_dtype, args, **kwargs)
    641 @doc(pd.Series)
    642 def apply(self, func, convert_dtype=True, args=(), **kwargs):
--> 643     result = super().apply(func, convert_dtype=convert_dtype, args=args, **kwargs)
    644     if isinstance(result, GeoSeries):
    645         if self.crs is not None:

File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/series.py:4433, in Series.apply(self, func, convert_dtype, args, **kwargs)
   4323 def apply(
   4324     self,
   4325     func: AggFuncType,
   (...)
   4328     **kwargs,
   4329 ) -> DataFrame | Series:
   4330     """
   4331     Invoke function on values of Series.
   4332 
   (...)
   4431     dtype: float64
   4432     """
-> 4433     return SeriesApply(self, func, convert_dtype, args, kwargs).apply()

File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/apply.py:1088, in SeriesApply.apply(self)
   1084 if isinstance(self.f, str):
   1085     # if we are a string, try to dispatch
   1086     return self.apply_str()
-> 1088 return self.apply_standard()

File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/apply.py:1143, in SeriesApply.apply_standard(self)
   1137         values = obj.astype(object)._values
   1138         # error: Argument 2 to "map_infer" has incompatible type
   1139         # "Union[Callable[..., Any], str, List[Union[Callable[..., Any], str]],
   1140         # Dict[Hashable, Union[Union[Callable[..., Any], str],
   1141         # List[Union[Callable[..., Any], str]]]]]"; expected
   1142         # "Callable[[Any], Any]"
-> 1143         mapped = lib.map_infer(
   1144             values,
   1145             f,  # type: ignore[arg-type]
   1146             convert=self.convert_dtype,
   1147         )
   1149 if len(mapped) and isinstance(mapped[0], ABCSeries):
   1150     # GH#43986 Need to do list(mapped) in order to get treated as nested
   1151     #  See also GH#25959 regarding EA support
   1152     return obj._constructor_expanddim(list(mapped), index=obj.index)

File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/_libs/lib.pyx:2870, in pandas._libs.lib.map_infer()

File ~/Git/momepy/momepy/preprocessing.py:915, in _selecting_incoming_lines.<locals>.<lambda>(x)
    912 incoming = touching.loc[ls]
    914 # figuring out which ends of incoming edges needs to be connected to the center_pt
--> 915 incoming["first_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[0]))
    916 incoming["dist_fisrt_pt"] = incoming.center_pt.distance(incoming.first_pt)
    917 incoming["last_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[-1]))

File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/shapely/geometry/base.py:954, in BaseMultipartGeometry.coords(self)
    952 @property
    953 def coords(self):
--> 954     raise NotImplementedError(
    955         "Multi-part geometries do not provide a coordinate sequence")

NotImplementedError: Multi-part geometries do not provide a coordinate sequence

We should probably capture it and raise an informative error pointing towards explode() before simplification.

martinfleis avatar Aug 23 '22 12:08 martinfleis

There are some warnings emitted by pandas.

/Users/martin/Git/momepy/momepy/preprocessing.py:802: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  rab_adj.hdist.loc[g.Index] = hdist
/Users/martin/Git/momepy/momepy/preprocessing.py:802: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  rab_adj.hdist.loc[g.Index] = hdist
/Users/martin/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/dtypes/cast.py:122: ShapelyDeprecationWarning: The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.
  arr = construct_1d_object_array_from_listlike(values)

Can you try to avoid them?

martinfleis avatar Aug 23 '22 12:08 martinfleis

@martinfleis I have updated the code to incorporate a lot of what we discussed yesterday:

  • I went through the code and made sure I avoided pandas warnings (when possible - quite a few cases!)
  • I added an assert on GeoDataFrames with non-LineStrings
  • fixed some comments and minor stuff
  • and created an extra column on the output to label the modified edges with their corresponding rounabout_number. -> to check the total number of roundabouts you can just do output_gdf.simpl_edge.nunique()

gregmaya avatar Aug 24 '22 14:08 gregmaya

@martinfleis I've combined all the suggestions by @jGaboardi as well as included a diameter_factor for including adjacent polygons that stretch a bit more than what was considered before. I cannot figure out what the error is while trying to build the Read the Docs so a hint there would be nice to potentially merge this PR?! :)

gregmaya avatar Sep 07 '22 10:09 gregmaya

@gregmaya We are also seeing the Read The Docs error in #382, so it's not specific to this PR. However, I am wondering what is going on with the 3.8 and 3.10-dev tests.

jGaboardi avatar Sep 07 '22 12:09 jGaboardi

3.8-minimal has old geopandas. That needs to be catched and we should raise an error saying that the function requires newer version than 0.x.0 (need to check which is min).

3.10 is some issue with networkx main, ignore it here.

martinfleis avatar Sep 07 '22 12:09 martinfleis

3.8-minimal has old geopandas. That needs to be catched and we should raise an error saying that the function requires newer version than 0.x.0 (need to check which is min).

3.10 is some issue with networkx main, ignore it here.

.... So drumroll 🥁...?! 🙏😅

gregmaya avatar Sep 07 '22 12:09 gregmaya

by "that needs to be catched" I meant that you need to catch it and raise a warning :)

martinfleis avatar Sep 07 '22 12:09 martinfleis

congrats @gregmaya!!! :tada:

jGaboardi avatar Sep 09 '22 17:09 jGaboardi