overpass-api-python-wrapper icon indicating copy to clipboard operation
overpass-api-python-wrapper copied to clipboard

Multipolygons are breaking geojson convertion

Open Monstrofil opened this issue 4 years ago • 6 comments

When trying to fetch multipolygons (e.g. 11038555), I get error:

Received corrupt data from Overpass (incomplete polygon).

It seems that happens because outer and inner members are not always closed: they may be just ways.

Are you planning to improve this?

Monstrofil avatar Feb 01 '21 13:02 Monstrofil

UPD: I did this patch for me, it works, but it definitely need to be slightly improved before merge:

Index: overpass/api.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- overpass/api.py	(revision cd544c59c3546d442df0b5b16853ec555dd34def)
+++ overpass/api.py	(date 1612186223887)
@@ -9,7 +9,10 @@
 import geojson
 import logging
 from datetime import datetime
-from shapely.geometry import Polygon, Point
+
+from geojson import MultiLineString
+from shapely.geometry import Polygon, Point, mapping, LineString
+from shapely import geometry as geom, ops
 from io import StringIO
 from .errors import (
     OverpassSyntaxError,
@@ -222,40 +225,43 @@
             elif elem_type == "relation":
                 # Initialize polygon list
                 polygons = []
+                outer_ways = []
+                inner_ways = []
                 # First obtain the outer polygons
                 for member in elem.get("members", []):
-                    if member["role"] == "outer":
-                        points = [(coords["lon"], coords["lat"]) for coords in member.get("geometry", [])]
-                        # Check that the outer polygon is complete
-                        if points and points[-1] == points[0]:
-                            polygons.append([points])
-                        else:
-                            raise UnknownOverpassError("Received corrupt data from Overpass (incomplete polygon).")
-                # Then get the inner polygons
-                for member in elem.get("members", []):
-                    if member["role"] == "inner":
-                        points = [(coords["lon"], coords["lat"]) for coords in member.get("geometry", [])]
-                        # Check that the inner polygon is complete
-                        if points and points[-1] == points[0]:
-                            # We need to check to which outer polygon the inner polygon belongs
-                            point = Point(points[0])
-                            check = False
-                            for poly in polygons:
-                                polygon = Polygon(poly[0])
-                                if polygon.contains(point):
-                                    poly.append(points)
-                                    check = True
-                                    break
-                            if not check:
-                                raise UnknownOverpassError("Received corrupt data from Overpass (inner polygon cannot "
-                                                           "be matched to outer polygon).")
-                        else:
-                            raise UnknownOverpassError("Received corrupt data from Overpass (incomplete polygon).")
-                # Finally create MultiPolygon geometry
-                if polygons:
-                    geometry = geojson.MultiPolygon(polygons)
-            else:
-                raise UnknownOverpassError("Received corrupt data from Overpass (invalid element).")
+                    if member["role"] not in ["outer", "inner"]:
+                        continue
+                    points = [(coords["lon"], coords["lat"]) for coords in member.get("geometry", [])]
+                    if member["role"] == "outer":
+                        outer_ways.append(geom.LineString(points))
+                    elif member["role"] == "inner":
+                        inner_ways.append(geom.LineString(points))
+                outer_merged_line = ops.linemerge(outer_ways)
+                if isinstance(outer_merged_line, LineString):
+                    iterator = [outer_merged_line]
+                else:
+                    iterator = outer_merged_line.geoms
+                for line in iterator:
+                    polygons.append(Polygon(line.coords))
+
+                inner_merged_line = ops.linemerge(inner_ways)
+                if isinstance(inner_merged_line, LineString):
+                    iterator = [inner_merged_line]
+                else:
+                    iterator = inner_merged_line.geoms
+                for line in iterator:
+                    inner_poly = Polygon(line.coords)
+
+                    tmp_polygons = []
+                    for poly in polygons[:]:
+                        if poly.contains(inner_poly):
+                            poly -= inner_poly
+                        tmp_polygons.append(poly)
+                    polygons = tmp_polygons
+
+                all_polygons = [mapping(poly)['coordinates'] for poly in polygons]
+                if all_polygons:
+                    geometry = geojson.MultiPolygon(all_polygons)
 
             if geometry:
                 feature = geojson.Feature(

Monstrofil avatar Feb 01 '21 13:02 Monstrofil

Thanks for spotting and fixing!

mvexel avatar Mar 02 '21 03:03 mvexel

I'm wondering if it might be best to import another lib for converting OSM format to geojson and let this lib focus on the API stuff, rather than maintain a homegrown converter. Something like osm2geojson, perhaps?

dericke avatar Mar 04 '21 17:03 dericke

I had a quick look at replacing the built-in geojson method with osm2geojson, found that osm2geojson uses 7 decimal points of precision for coordinates, while this library has been using 6 (default from the geojson library) and osm2geojson nests the id property of a feature under properties, whereas the built-in method puts it in the top level of the feature.

dericke avatar Mar 04 '21 20:03 dericke

I think the best way to go about supporting GeoJSON properly is not to roll our own but instead support __geo_interface__. Since geojson implements this interface too, there's no need to introduce any other dependencies.

mvexel avatar Mar 22 '22 17:03 mvexel