geos
geos copied to clipboard
Improve consistency of `GEOSVoronoiDiagram`
change
the getDiagramEdges now only returns a MultiLineString, whether there is one or many lines. Which reduces to two the number of possible kind of geometries given by GEOSVoronoiDiagram
reason
I've been working on adding voronoi_diagram to RGeo. And looking at the Shapely PR for voronoi diagram and mine in RGeo, it seems like the GEOSVoronoiDiagram CAPI method causes confusion by returning three possible kind of geometries:
GeometryCollection(whenonlyEdgesis false)MultiLineString(onlyEdgesis true and there are at least two lines)LineString(onlyEdgesis true and there is only one line)
The parameter onlyEdges changes the outcome, which is ok to reason about. However, the fact that we could have either a LineString or a MultiLineString is way harder to work around in my opinion.
Note for reviewer
It's been year since I haven't made a single line of C++, feel free to fix the implementation! To be honest, I've copied the code from other parts of the library, until one was working 😅
Alternative
We could also wrap the result in a GeometryCollection, to comply to the original description and be even more consistent. But I guess this would be more work here, and less efficient as well
Actually, after running a benchmark, the alternative is not really slower than the first proposition. Hence I'll change for the alternative. I'll do that in a separate commit, which will be easier to reverse.
Benchmark
///usr/bin/env g++ ${0} -std=c++11 -isystem ../google/benchmark/include -I/usr/local/include -L../google/benchmark/build/src -L/usr/local/lib -lbenchmark -lgeos -o ${0%%.cc} && ${0%%.cc} ; exit $?
#include <benchmark/benchmark.h>
#include <geos/triangulate/VoronoiDiagramBuilder.h>
#include <geos/io/WKTReader.h>
#include <geos/io/WKBReader.h>
using namespace geos::triangulate;
using namespace geos::geom;
using namespace geos::io;
#define bench(name, wkt) static void multiline_##name(benchmark::State& state) { \
geos::triangulate::VoronoiDiagramBuilder builder; \
std::unique_ptr<Geometry> geom(readTextOrHex(wkt)); \
const GeometryFactory& geomFact(*GeometryFactory::getDefaultInstance()); \
builder.setSites(*geom); \
for (auto _ : state) { \
builder.getDiagramEdges(geomFact); \
} \
} \
BENCHMARK(multiline_##name); \
static void collection_##name(benchmark::State& state) { \
geos::triangulate::VoronoiDiagramBuilder builder; \
std::unique_ptr<Geometry> geom(readTextOrHex(wkt)); \
const GeometryFactory& geomFact(*GeometryFactory::getDefaultInstance()); \
builder.setSites(*geom); \
for (auto _ : state) { \
builder.getDiagramEdgesCollection(geomFact); \
} \
} \
BENCHMARK(collection_##name);
static std::unique_ptr<Geometry>
readTextOrHex(const char* wkt)
{
WKTReader wktreader;
WKBReader wkbreader;
std::string wktstr(wkt);
if ("01" == wktstr.substr(0, 2) || "00" == wktstr.substr(0, 2))
{
std::stringstream is(wkt);
return wkbreader.readHEX(is);
}
else
return wktreader.read(wktstr);
}
bench(short, "LINESTRING (10 10, 20 20, 30 30)")
bench(long, "0104000080170000000101000080EC51B81E11A20741EC51B81E85A51C415C8FC2F528DC354001010000801F85EB5114A207415C8FC2F585A51C417B14AE47E1BA3540010100008085EB51B818A20741A8C64B3786A51C413E0AD7A3709D35400101000080000000001BA20741FED478E984A51C413E0AD7A3709D3540010100008085EB51B818A20741FED478E984A51C413E0AD7A3709D354001010000800AD7A37016A20741FED478E984A51C413E0AD7A3709D35400101000080000000001BA2074154E3A59B83A51C413E0AD7A3709D3540010100008085EB51B818A2074154E3A59B83A51C413E0AD7A3709D354001010000800AD7A37016A2074154E3A59B83A51C413E0AD7A3709D35400101000080000000001BA20741AAF1D24D82A51C413E0AD7A3709D3540010100008085EB51B818A20741AAF1D24D82A51C413E0AD7A3709D35400101000080F6285C8F12A20741EC51B81E88A51C414160E5D022DB354001010000802222222210A2074152B81EC586A51C414160E5D022DB354001010000804F1BE8B40DA2074152B81EC586A51C414160E5D022DB354001010000807B14AE470BA2074152B81EC586A51C414160E5D022DB354001010000802222222210A20741B81E856B85A51C414160E5D022DB354001010000804F1BE8B40DA20741B81E856B85A51C414160E5D022DB354001010000807B14AE470BA20741B81E856B85A51C414160E5D022DB35400101000080A70D74DA08A20741B81E856B85A51C414160E5D022DB35400101000080D4063A6D06A20741B81E856B85A51C414160E5D022DB354001010000807B14AE470BA207411F85EB1184A51C414160E5D022DB35400101000080A70D74DA08A207411F85EB1184A51C414160E5D022DB35400101000080D4063A6D06A207411F85EB1184A51C414160E5D022DB3540")
BENCHMARK_MAIN();
Run on (10 X 24.1209 MHz CPU s)
CPU Caches:
L1 Data 64 KiB
L1 Instruction 128 KiB
L2 Unified 4096 KiB (x10)
Load Average: 3.19, 2.77, 2.67
-----------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------
multiline_short 313247 ns 313187 ns 2235
collection_short 318665 ns 318316 ns 2197
multiline_long 2349749 ns 2349218 ns 294
collection_long 2407711 ns 2406856 ns 291
Personnaly I prefer this option as it is even simpler in the CAPI GEOSVoronoiDiagram usage.
I'm a little concerned about the CI failure, is this rebased to main? I don't really see how your API change could cause the failure, but neither do I want to merge it if it's somehow coming from there....
I think we're ok, the failure was fixed in main at 433b91aa0bba05ce8833f5831a4bd82e8450a9e8
Does this change the return value to always be an actual GeometryCollection, which contains LineStrings if onlyEdges is true?
If so, I am not in favour of this change. One reason is that MultiLineString is more widely supported than a GeometryCollection in the GEOS API (i.e. by spatial predicates and overlay ops). Also, since it is a subclass of GeometryCollection there should be no difference in downstream processing.
Was the stated intention was to eliminate the possibility of returning a single LineString? This is more reasonable., especially since I'm not sure a Voronoi diagram can ever contain a single edge. Although there are other places in the API which can return either a single LineString or a MultiLineString, so does this really need to a special case?
Note that the getNumGeometries and getGeometryN methods are on the Geometry class. This is a deliberate difference to the SFS standard API, and was done to allow more uniform handling of collections and singleton geometries.
@pramsey the failure was in main at the time indeed, so I'll rebase
@dr-jts, both options were considered (and commits for both options are available to be discussed in the branch). I'd be happy to rollback to the previous one.
For the reasoning behind it, I'm not directly using GEOS C++ API, but GEOS C API to maintain its Ruby bindings (RGeo)
As I have seen in the Shapely implementation of voronoi diagram, and as I see now for RGeo, the fact that this method can return this wide range of types is confusing. And the single LineString is even more, as in more dynamic languages where you can directly iterate over the result, it would be counter intuitive to offer iteration over a single geometry.
I'll rollback to the initial proposition of making sure we at least have a collection. Tell me if that works for you or not.
EDIT: You'll find two commits with the two proposal, and the commit messages indicating the reasoning behind. Hopefully the latter is ok with you
The MultiLineString option looks good to me.
The C API doc needs some work to describe the two potential results. Feel free to take a crack at this, or it can be fixed up later.
To me it feels like maybe there should be a separate C API function to return Voronoi edges (which is how this is exposed in PostGIS). Maybe that horse has left the barn, though.
To me it feels like maybe there should be a separate C API function to return Voronoi edges (which is how this is exposed in PostGIS). Maybe that horse has left the barn, though.
@dr-jts yeah I felt exactly the same way when implementing. Since lots of libraries depend on GEOS CAPI, I did not dare.
I'll try and and fix the documentation, but I'm not sure I'd be the best at this. I'll try tomorrow, if it is not enough, I'll revert so you can merge this before it takes too long. What do you think?
I'll try and and fix the documentation, but I'm not sure I'd be the best at this. I'll try tomorrow, if it is not enough, I'll revert so you can merge this before it takes too long. What do you think?
Sounds good!
Another reason for sticking with MultiLineString as the return type for edges is that it's non-breaking change. Switching to a GeometryCollection might break API.
Another reason for sticking with
MultiLineStringas the return type for edges is that it's non-breaking change. Switching to aGeometryCollectionmight break API.
It is still breaking since it doesn't return a LineString anymore. Although this could arguably be considered a bug fix.
And per documentation, returning a GeometryCollection would actually be already, since that is what is written.
Anyway, I'm fooling around. The MultiLineString choices seems practical, and will be easy enough to explain in RGeo
@pramsey @dr-jts is there any blocker left?
@dr-jts @dbaston @pramsey Is this okay to commit or something else needs to be done?