Incorrect intersection and difference results
This might be a difficult-to-solve issue, but I get some weird results from polygon overlays. One example follows.
I have two valid polygons (attatched), where geom2 has a very thin hole in it. The intersection with geom1 should be everything except the hole, but the resulting geometry is only the hole. The opposite is the case after difference overlay.
I haven't been able to install geosop, but have tested that this happens in QGIS with GEOS 3.12, as well as in QGIS with GEOS 3.11.2, shapely with GEOS 3.11.1 and sf with GEOS 3.7.
So, that hole is about a nanometer wide. This kind of argues for ST_MakeClean, again, something to take "valid" geometries and remove ridiculous components like "holes that are less than a molecule wide", and various other structurally useless things like spikes and gores and ultra-fine zigzags. We have some prior art of these kinds of inputs, but dealing with them is still an unsolved problem. You might find that using the fixed precision variant of ST_Intersection results in a "more correct" answer (hopefully) although at the cost of very slightly rounding all your vertices in the process. If you don't have exact edge-matching in your data set it might be a useful approach.
Thanks for the answer @pramsey!
Context: we are in the process of refactoring a statistics production previously done without trouble in ArcGIS, so these kinds of differences are interesting.
I realize this is a tiny hole and easily solvable. This is one of many examples.
The more problematic issues (should maybe be a separate issue?) are when it’s spikes, as you mention. The best solution I have, small negative buffer, often enough raises a topology exception. I would happily take some tips on how to deal with this.
In this case as well, I'd be interested to know if something like GEOSIntersectionPrec() helps. If you're running in PostGIS, the three-parameter version of ST_Intersection(). If the tolerance is big enough it should collapse some of these things down.
Not running PostGIS, but geopandas/shapely. The shapely overlay operations offer a grid_size parameter, which looks like what GEOSIntersectionPrec is doing.
I have tried this previously, with no success (a lot of geometries disappeared with no warning), but I will try again.
Hum, disappearing geometries seems bad, particularly since the *Prec code path is supposed to drop us into our most reliable algorithms (with the caveat that every vertex ends up snap rounded). I wonder if it's because after rounding the geometry is invalid (self-intersecting ring). @dr-jts ?
I'll have to dig into this to see what's going on. This feels like a known issue where numerical robustness failures cause the topology to be computed incorrectly, and the safeguard heuristics in OverlayNG don't detect the problem.
@mortewle Do you have some examples of "spike" cases that are causing problems?
In this case as well, I'd be interested to know if something like GEOSIntersectionPrec() helps.
I tried using JTS OverlayNG with fixed-precision Snap-Rounding (scale factor = 10^6). This causes the overlay to work correctly. (I'll confirm this in GEOS as well).
So this is a potential fix, although not ideal, since Snap-Rounding causes all coordinates to be rounded. And to use it to resolve occasional errors requires that it be used for all cases.
Yes, this example works correctly with 1e-6 rounding for me as well, thanks! The rounding of the coordinates is not a big issue in this production, but it might be in other cases.
The program takes some time to complete, but I am yet to encounter any dissappearing or doublet polygons that are wider than the grid size. There might have been some other issue when I last tested this approach.
A reduced example showing the same behaviour:
A
POLYGON ((458435 7432225, 458435 7432245, 458455 7432245, 458455 7432225, 458435 7432225))
B
POLYGON ((458500 7432200, 458400 7432200, 458400 7432300, 458500 7432300, 458500 7432200), (458454.7221242461 7432226.377759292, 458454.13999999897 7432223.769999999, 458455.62000000087 7432230.3999999985, 458454.7221242461 7432226.377759292))
Intersection
POLYGON ((458454.41457013536 7432225, 458455 7432227.622567566, 458454.7221242461 7432226.377759292, 458454.41457013536 7432225))
Analysis
The A polygon hole is a triangle with apex on the left, and with nearly-coincident linework. Computing the overlay edge topology introduces new vertices in the hole edges. Due to finite-precision representation of the intersection points, they move slightly, and the triangle apex point ends up on the right side of the intersection segment. This inversion of the triangle likely corrupts the overall topology, so that the intersection result is determined to be the triangle, rather than the larger square.
Input hole with apex on left
Result triangle with apex on right
Workaround
Use Overlay with Snap-Rounding by providing a grid size/scale factor.
Potential Fix
This case can be evaluated more accurately using snapping. Snapping will be invoked if the incorrect result can be detected after the initial full-precision overlay. One way to do this is to use a different, more robust algorithm to compute the expected area of the intersection result. The area of the incorrect computed result would be very different, which would allow a TopologyException` to be thrown. Then fallback snapping would be invoked. The challenge is how to implement a robust Intersection-Area algorithm which does not degrade performance too much.
Another way to detect the invalid computed topology is to check the location of other vertices in the polygons (via point-in-polygon tests). If a directly-computed location does not match the location determined from propagating edge locations through the topology graph, a TopologyException can be thrown. The challenge here is choosing a good subset of vertices to test.