geo
geo copied to clipboard
simplify_vw_preserve causing self-intersecting ring
We have some geometry code which would benefit from avoiding self-intersections after simplification. I tried to switch from the rdp algorithm to the simplify_vw_preserve
function, and wrote some property tests. After fuzzing a bit, quickcheck came up with this counterexample:
use geo::polygon;
use geo::SimplifyVwPreserve;
let pol = polygon![
(x: 1., y: 4.),
(x: 3., y: 4.),
(x: 1., y: 1.),
(x: 7., y: 0.),
(x: 1., y: 0.),
(x: 0., y: 1.),
(x: 1., y: 4.),
];
let simplified = pol.simplify_vw_preserve(&2.25);
assert_eq!(simplified, polygon![
(x: 1., y: 4.),
(x: 3., y: 4.),
(x: 1., y: 1.),
(x: 7., y: 0.),
(x: 1., y: 0.),
(x: 1., y: 4.),
]);
This simplified object has a self-intersection at (x: 1., y: 1.)
.
Tested on geo version 0.25 and 0.26.
If I cyclically permute the coordinates, there is no self-intersection. So this is probably related to handling of the exterior being a linear ring.
Edit: Nope, one more cyclic permutation makes the issue reappear. I think the first cyclic permutation just made the algorithm not consider that particular triangle in question. So the bug is in the algorithm itself.
I guess the cartesian_intersect
function does not detect intersections with endpoints of lines?
Thanks for catching the possible source of the problem @audunska. I'll take a look as soon as I can.
Do you have a specific example of an endpoint intersection detection miss? I'd like to test it against some variations in the algorithm.
You can use the points from the simplified geometry above:
l1 = Line::new((1., 1.), (7., 0.));
l2 = Line::new((1., 0.), (1., 4.));
assert!(cartesian_intersects(l1.start_point(), l1.end_point(), l2.start_point(), l2.end_point()));
The point (1., 1.)
of the first line touches the second line.
On a side note, I ended up implementing the algorithm described in this paper instead, but optimized using an r* tree in the same way as SimplifyVwPreserve: https://doi.org/10.1016/j.cageo.2013.08.011
Would there be interest in contributing that implementation, maybe?
I've tuned the algorithm a bit to:
- be more aggressive in deciding what is an intersection, by using the standard
geo
intersection check for new lines and candidate existing lines. - Reduced the minimum points from 6 to 5, as using 6 was causing your example to exit early while the polygon was in an invalid state. I suspect we might still run into this problem, but I'm not sure it can be completely fixed without implementing some sort of full "rewind" if we hit the minimum point limit while the polygon is invalid.
Could you use my branch (https://github.com/urschrei/geo/tree/main) for your prop tests? I'm not 100% sure if this will work as a fix, but let's see.
In answer to your topology-preserving RDP impl: yes, we'd be delighted with a contribution!
(My fix passes our existing tests and your specific test, to be clear)
Got a failure with this linestring on your branch:
let ls = line_string![
(x: -4., y: 3.),
(x: 4., y: 3.),
(x: 1., y: 2.),
(x: 20., y:-3.),
(x: 0., y:-3.),
(x: 0., y: 2.),
(x: -4., y: 3.),
];
let expected_simplified = line_string![
(x: -4., y: 3.),
(x: 20., y:-3.),
(x: 0., y:-3.),
(x: 0., y: 2.),
(x: -4., y: 3.),
];
assert_eq!(ls.simplify_vw_preserve(&6.25), expected_simplified);
As far as I can see from testing, the current topology-preserving algorithm can't deal with this case:
The topology-preserving logic tries to fix intersections by removing the previous point (p -1
) in the line if possible if p
causes an intersection. However, in this case that doesn't work because the point causing the intersection (1, 2)
which has index 2
has two adjacent triangles it can check for removal candidates: (-1, 0, 3)
and (0, 3, 4)
. The first can't be used because index 0 is an edge, and the second is skipped because its index (3
) is higher than the current point's (2)
, which means it's the next point, not the previous one.
But even if that weren't the case, the "problem" retained point has index 5, and the simplification algorithm never looks at it because its triangle (4, 5, 6)
area is too large (10
). This is why this input also fails using the standard VW algorithm.
JTS outputs the correct LineString (0, 3, 4, 6)
for that tolerance (tested using TestBuilder), but I don't know how its topology-preservation logic is implemented…
Update: JTS uses epsilon ^ 2
in its VW algorithm, which led me to think that its topology-preservation logic was being used, when in fact it was simply removing the vertex because the triangle area is below the tolerance (39.0625). GEOS and PostGIS produce the same valid linestring / invalid polygon as geo
.
Thinking about this more, the reason the preservation process is failing in the latest case is because of the rule preventing removal of a triangle that forms part of the geometry's "edge". If this case were part of a larger geometry (i.e. there are non-removed points preceding index 0 (Point(-4, 3)
), that stopping rule wouldn't be applied and Point(20, -3)
would be removed, removing that self-intersection.
As quickcheck tries to find the most simple counter-examples it can, it's going to continue trying small geometries, which are simply far more likely to hit a stopping rule than larger "real world" geometries. This particular case can't be fixed – there aren't enough points left to remove when the intersection occurs.
Sure! I guess it just means that this algorithm tries hard to preserve topology, but can't quite guarantee it. Should probably document that fact :)