Polyline Douglas-Peucker-Algorithm - Issue Calculating Distance on Some Polylines
Summary
I have a polyline that is getting converted to a line with a single point by the simplify method. This is not my area of expertise, but when debugging, I'm seeing that the getPerpendicularDistance method is returning 0 for every coordinate on line 95 in SimplifyDouglasPeucker.php. When I use MySQL's ST_Simplify which also uses the Douglas-Peucker-Algorithm (see the docs), I'm getting a very similar line to the original with one less point. I also get identical results to MySQL when using Turf.js (I know PHPGeo uses meters vs coordinate units). I suspect something is off in the getPerpendicularDistance for some polylines.
I've tried this with several other polylines and haven't had an issue. So it seems the distance method just struggles with certain polylines.
Details
The Polyline
The original GeoJSON -
{"type":"LineString","coordinates":[[-95.827709,37.527891],[-95.827624,37.527886],[-95.827531,37.527872],[-95.827464,37.527848],[-95.82742,37.52782],[-95.827312,37.527805],[-95.827237,37.527765],[-95.8272,37.527703],[-95.827111,37.527676],[-95.827083,37.527592],[-95.827014,37.527531],[-95.826937,37.527535],[-95.826841,37.527549],[-95.826848,37.527591],[-95.826828,37.527716],[-95.826892,37.527824],[-95.826835,37.527946],[-95.826863,37.528049],[-95.826808,37.528136],[-95.826859,37.528222],[-95.826932,37.528266],[-95.827048,37.52827],[-95.827155,37.52822],[-95.827232,37.528251],[-95.827373,37.528222],[-95.82747,37.528271],[-95.827534,37.528316],[-95.827636,37.528338],[-95.827762,37.52836],[-95.827876,37.528395],[-95.828052,37.528404],[-95.828187,37.528402],[-95.828294,37.52837],[-95.82843,37.528355],[-95.828547,37.528327],[-95.82866,37.528258],[-95.828774,37.528233],[-95.82885,37.528176],[-95.828926,37.528096],[-95.828985,37.528005],[-95.829,37.527891],[-95.828952,37.527849],[-95.828823,37.527858],[-95.828711,37.527899],[-95.828595,37.527977],[-95.828574,37.528026],[-95.828523,37.528097],[-95.828453,37.528164],[-95.828345,37.528204],[-95.828271,37.528194],[-95.828202,37.528172],[-95.828057,37.528115],[-95.827869,37.527969],[-95.827709,37.527891]]}
Simplification with PHPGeo
Code -
use Location\Processor\Polyline\SimplifyDouglasPeucker;
// I've tried using a bunch of different tolerance values ranging from 0.00003 to 1000 but I always get the same result
$processor = new SimplifyDouglasPeucker(0.5);
$simplified = $processor->simplify($polyline);
PHPGeo result -
{
"type": "LineString",
"coordinates": [
[
-95.827709,
37.527891
],
[
-95.827709,
37.527891
]
]
}
Simplification with MySQL
I'm obviously using a different tolerance here because MySQL uses different units. So, I wouldn't expect the PHPGeo result to be exactly the same. I just use this as a demonstration that a simplified line string can be returned using the Douglas-Peucker-Algorithm.
Query -
SELECT ST_SIMPLIFY(linestring_col, 0.000003) FROM table
MySQL ST_Simplify result -
{"type": "LineString", "coordinates": [[-95.827709, 37.527891], [-95.827624, 37.527886], [-95.827531, 37.527872], [-95.827464, 37.527848], [-95.82742, 37.52782], [-95.827312, 37.527805], [-95.827237, 37.527765], [-95.8272, 37.527703], [-95.827111, 37.527676], [-95.827083, 37.527592], [-95.827014, 37.527531], [-95.826937, 37.527535], [-95.826841, 37.527549], [-95.826848, 37.527591], [-95.826828, 37.527716], [-95.826892, 37.527824], [-95.826835, 37.527946], [-95.826863, 37.528049], [-95.826808, 37.528136], [-95.826859, 37.528222], [-95.826932, 37.528266], [-95.827048, 37.52827], [-95.827155, 37.52822], [-95.827232, 37.528251], [-95.827373, 37.528222], [-95.82747, 37.528271], [-95.827534, 37.528316], [-95.827762, 37.52836], [-95.827876, 37.528395], [-95.828052, 37.528404], [-95.828187, 37.528402], [-95.828294, 37.52837], [-95.82843, 37.528355], [-95.828547, 37.528327], [-95.82866, 37.528258], [-95.828774, 37.528233], [-95.82885, 37.528176], [-95.828926, 37.528096], [-95.828985, 37.528005], [-95.829, 37.527891], [-95.828952, 37.527849], [-95.828823, 37.527858], [-95.828711, 37.527899], [-95.828595, 37.527977], [-95.828574, 37.528026], [-95.828523, 37.528097], [-95.828453, 37.528164], [-95.828345, 37.528204], [-95.828271, 37.528194], [-95.828202, 37.528172], [-95.828057, 37.528115], [-95.827869, 37.527969], [-95.827709, 37.527891]]}```
Hi, thanks for this report.
Upon reviewing the input data, I've noticed that the first and last points are identical, indicating that your input is a Polygon rather than a Polyline. The current functionality operates as expected when you remove either the first or last point from the input data.
I would categorize this as a feature request (“Add polygon support for Douglas-Peucker”) rather than a bug, because the simplify() method is designed to accept only a polyline, as clearly specified by its argument type.
The issue arises because the current Douglas-Peucker implementation uses the first and last points to form the baseline. If these two points are the same, it results in a zero-length segment, as shown here.
In the meantime, I recommend using the SimplifyBearing algorithm until we develop a solution for this.
I ran into the same issue when using polygon coordinates. As I know my data is always a polygon, I simply remove the last coordinate, simplify and then add the last coordinate back in.
So maybe a check could be made, if the first and last coordinates are the same?
Anyway here's part of what I was doing, in case it's useful for anyone else (note I start with an array of coordinates):
$polyline = new Polyline();
// Remove last point, to allow for polyline simplification.
$lastCoordinates = array_pop($ring);
foreach ($ring as $coord) {
$polyline->addPoint(new Coordinate($coord[1], $coord[0]));
}
// Simplify with tolerance in metres.
$processor = new SimplifyDouglasPeucker($tolerance);
$simplifiedPolyline = $processor->simplify($polyline);
// Add last point back in (which closes the ring).
$simplifiedPolyline->addPoint(new Coordinate($lastCoordinates[1], $lastCoordinates[0]));
// Convert back to coordinates array lng/lat.
$simplifiedRing = [];
foreach ($simplifiedPolyline->getPoints() as $point) {
$simplifiedRing[] = [$point->getLng(), $point->getLat()];
}
Edited, fixed a typo, and removed an incorrect comment about tolerance.