momepy
momepy copied to clipboard
ENH: geometry-based simplification of roundabouts
I'm sure there might be corrections to be made (including one BUG FIX that I found whilst doing the test!) but I believe submitting is helpful to move forward.
Hey, I have renamed the PR and resolved the merge conflict.
- When creating a PR, try to always ensure that it is based on the latest version of
main. When it is a simple one, make sure that your local repo is up to date and then make a new branch from main. When it is a larger one that takes time, do the same in the beginning and then merge main to the branch before creating a PR
git fetch upstream
git merge upstream/main
- make sure you follow the style guide. As you can see from the linting check, it does not follow
blackat the moment. Optionally (but recommended), you can setup pre-commit hooks to automatically run black when you make a git commit.
python -m pip install pre-commit
From the root of the momepy repository, you should then install the pre-commit included in GeoPandas:
pre-commit install
Then black will be run automatically each time you commit changes. You can skip these checks with git commit --no-verify.
I'll check the actual contents later ;).
Wow.! Ok definitely PRs are a science in itself! .. Thanks for the explanations... although some still sound foreign to me I'll have your recommendations to start Googling. Many thanks @martinfleis !
sorry I marked the PR as 'ready for review' by mistake! (not sure how to undo 😅)
Hoping the auto-BLACK pre-commit action is actually working. 🧐
Hoping the auto-BLACK pre-commit action is actually working.
It looks it is :).
"touches" is not a valid predicate argument in geopandas v0.8.2, which is the version being pulled in for the 3.8-minimal CI run, resulting in the current failure.
I wouldn't bother trying to make it work for 0.8.2 to be fair. There's too many constraints.
@gregmaya give us a shout if you want another round of reviews.
@gregmaya give us a shout if you want another round of reviews.
A review would be appreciated, yes! @martinfleis
I really want to move to a selection algorithm (which I've already started some explorations)... but obviously leaving something working here is crucial.
For the sake of future improvements on this roundabout_simplification I'm going to list a few ideas to include/solve:
- The current selection part of the algorithm is classifying many false positives (i.e. all those roundabouts that are 'cut in half' by streets continuing their original direction.
- Include an extra column on the output with a label that highlights the streets that were extended and were 'incoming to a roundabout'.
- Create an optional return with the deleted streets as a separate gdf. (points 2 and 3 could be implemented in all the functions that aim at network simplification)
- Overall improvements of inefficient coding.
If you pass a dataframe with MultiLinestrings, it fails.
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
Untitled-1.ipynb Cell 4 in <cell line: 1>()
----> 1 simplified = momepy.roundabout_simplification(df)
File ~/Git/momepy/momepy/preprocessing.py:1053, in roundabout_simplification(edges, polys, circom_threshold, area_threshold, include_adjacent, center_type, angle_threshold)
1046 rab = _selecting_rabs_from_poly(
1047 polys,
1048 circom_threshold=circom_threshold,
1049 area_threshold=area_threshold,
1050 include_adjacent=include_adjacent,
1051 )
1052 rab_multipolygons = _rabs_center_points(rab, center_type=center_type)
-> 1053 incoming_all, idx_drop = _selecting_incoming_lines(
1054 rab_multipolygons, edges, angle_threshold=angle_threshold
1055 )
1056 output = _ext_lines_to_center(edges, incoming_all, idx_drop)
1058 return output
File ~/Git/momepy/momepy/preprocessing.py:915, in _selecting_incoming_lines(rab_multipolygons, edges, angle_threshold)
912 incoming = touching.loc[ls]
914 # figuring out which ends of incoming edges needs to be connected to the center_pt
--> 915 incoming["first_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[0]))
916 incoming["dist_fisrt_pt"] = incoming.center_pt.distance(incoming.first_pt)
917 incoming["last_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[-1]))
File ~/Git/geopandas/geopandas/geoseries.py:643, in GeoSeries.apply(self, func, convert_dtype, args, **kwargs)
641 @doc(pd.Series)
642 def apply(self, func, convert_dtype=True, args=(), **kwargs):
--> 643 result = super().apply(func, convert_dtype=convert_dtype, args=args, **kwargs)
644 if isinstance(result, GeoSeries):
645 if self.crs is not None:
File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/series.py:4433, in Series.apply(self, func, convert_dtype, args, **kwargs)
4323 def apply(
4324 self,
4325 func: AggFuncType,
(...)
4328 **kwargs,
4329 ) -> DataFrame | Series:
4330 """
4331 Invoke function on values of Series.
4332
(...)
4431 dtype: float64
4432 """
-> 4433 return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/apply.py:1088, in SeriesApply.apply(self)
1084 if isinstance(self.f, str):
1085 # if we are a string, try to dispatch
1086 return self.apply_str()
-> 1088 return self.apply_standard()
File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/apply.py:1143, in SeriesApply.apply_standard(self)
1137 values = obj.astype(object)._values
1138 # error: Argument 2 to "map_infer" has incompatible type
1139 # "Union[Callable[..., Any], str, List[Union[Callable[..., Any], str]],
1140 # Dict[Hashable, Union[Union[Callable[..., Any], str],
1141 # List[Union[Callable[..., Any], str]]]]]"; expected
1142 # "Callable[[Any], Any]"
-> 1143 mapped = lib.map_infer(
1144 values,
1145 f, # type: ignore[arg-type]
1146 convert=self.convert_dtype,
1147 )
1149 if len(mapped) and isinstance(mapped[0], ABCSeries):
1150 # GH#43986 Need to do list(mapped) in order to get treated as nested
1151 # See also GH#25959 regarding EA support
1152 return obj._constructor_expanddim(list(mapped), index=obj.index)
File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/_libs/lib.pyx:2870, in pandas._libs.lib.map_infer()
File ~/Git/momepy/momepy/preprocessing.py:915, in _selecting_incoming_lines.<locals>.<lambda>(x)
912 incoming = touching.loc[ls]
914 # figuring out which ends of incoming edges needs to be connected to the center_pt
--> 915 incoming["first_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[0]))
916 incoming["dist_fisrt_pt"] = incoming.center_pt.distance(incoming.first_pt)
917 incoming["last_pt"] = incoming.geometry.apply(lambda x: Point(x.coords[-1]))
File ~/mambaforge/envs/geo_dev/lib/python3.10/site-packages/shapely/geometry/base.py:954, in BaseMultipartGeometry.coords(self)
952 @property
953 def coords(self):
--> 954 raise NotImplementedError(
955 "Multi-part geometries do not provide a coordinate sequence")
NotImplementedError: Multi-part geometries do not provide a coordinate sequence
We should probably capture it and raise an informative error pointing towards explode() before simplification.
There are some warnings emitted by pandas.
/Users/martin/Git/momepy/momepy/preprocessing.py:802: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
rab_adj.hdist.loc[g.Index] = hdist
/Users/martin/Git/momepy/momepy/preprocessing.py:802: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
rab_adj.hdist.loc[g.Index] = hdist
/Users/martin/mambaforge/envs/geo_dev/lib/python3.10/site-packages/pandas/core/dtypes/cast.py:122: ShapelyDeprecationWarning: The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.
arr = construct_1d_object_array_from_listlike(values)
Can you try to avoid them?
@martinfleis I have updated the code to incorporate a lot of what we discussed yesterday:
- I went through the code and made sure I avoided pandas warnings (when possible - quite a few cases!)
- I added an assert on GeoDataFrames with non-LineStrings
- fixed some comments and minor stuff
- and created an extra column on the output to label the modified edges with their corresponding rounabout_number.
-> to check the total number of roundabouts you can just do
output_gdf.simpl_edge.nunique()
@martinfleis I've combined all the suggestions by @jGaboardi as well as included a diameter_factor for including adjacent polygons that stretch a bit more than what was considered before.
I cannot figure out what the error is while trying to build the Read the Docs so a hint there would be nice to potentially merge this PR?! :)
@gregmaya We are also seeing the Read The Docs error in #382, so it's not specific to this PR. However, I am wondering what is going on with the 3.8 and 3.10-dev tests.
3.8-minimal has old geopandas. That needs to be catched and we should raise an error saying that the function requires newer version than 0.x.0 (need to check which is min).
3.10 is some issue with networkx main, ignore it here.
3.8-minimal has old geopandas. That needs to be catched and we should raise an error saying that the function requires newer version than 0.x.0 (need to check which is min).
3.10 is some issue with networkx main, ignore it here.
.... So drumroll 🥁...?! 🙏😅
by "that needs to be catched" I meant that you need to catch it and raise a warning :)
congrats @gregmaya!!! :tada: