CoordinateSequence: create using external buffer
This PR adds methods to create a CoordinateSequence backed by an external buffer. Currently this can only be done for XYZ or XYZM sequences until GEOS algorithms are made "2d-safe."

Overall this change causes no performance regression, and a small gain in a few cases.
Nice! Would you consider an equivalent for x/y/... stored in seperate buffers? (Ie the qgis model)
Not really... the only reason this works is because the external buffer is exactly the same shape as the internal buffer.
Currently this can only be done for XYZ or XYZM sequences until GEOS algorithms are made "2d-safe."
What needs to be done to make things "2d-safe"?
What needs to be done to make things "2d-safe"?
Need to stop reading 3D coordinates (Coordinate) by reference without checking whether the sequence is actually 3D or 4D. Not a small task!
Can we mostly start reading CoordinateXY by reference instead?
@pramsey
Not really... the only reason this works is because the external buffer is exactly the same shape as the internal buffer.
Fair enough -- this might be motivation enough for me to rework how QGIS internally stores these arrays!
Can we mostly start reading CoordinateXY by reference instead?
It's always safe to read CoordinateXY&, though the assumption that we have a Coordinate& shows up in a surprising number of places.
Simplification and triangulation are good examples...they're 2D algorithms, but they also have this assumption that we're going to preserve z values from the input.
Here's another test I did:
- Removed the part of this PR that copies an external XY buffer to XYZ. This should give best case performance but could crash if the sequence is used in parts of the code that assume 3d coordinates.
- Updated PostGIS so that GEOS coordinate sequences reference the memory in a
POINTARRAYwith no copies - Ran
ST_IsValidon all polygons in 10 copies of the GADM country boundaries dataset (328 million vertices total), withSTORAGE EXTERNAL
When geometries are copied into GEOS, this takes 100.58 seconds. When geometries are created with no coordinate copies, this takes 100.55 seconds. Not a big win for zero-copy!
A second test of a self-join with ST_Intersects takes 80 seconds when coordinates are copied and 74 seconds when the coordinates are reference. Maybe a 7.5% gain is worthwhile?
About a 10% gain on ST_Centroid. I have never been too worried about ST_Centroid as a bottleneck, but maybe better performance for very fast functions could still be useful for clients like GDAL that don't have all of the PostGIS-style short circuits. Calling GEOSIntersection on geometries that are likely to be disjoint, for example. (Although GEOSIntersection currently requires padded inputs)
To safely apply these changes, we'd need to have C API functions that require padded coordinates check their inputs and automatically copy into a new padded coordinate sequence. (This is easy to do with something like seq->reserve(seq->size()).)
As a PostGIS guy, this leaves me cold. The only win is for use cases where they are wrapping and using things like Area() and Length() and Distance() frequently, which... yeah it happens, but as your testing shows, for the "that's too hard to write myself, I need GEOS to do it" use cases, there's no win. I'd vote for GEOS simplicity and letting people write their own easy calculations. And honestly, of course the Rust guys are going to write Rust implementations of Area and Length and Distance! Purity! We saw the same thing with Turf.js and JSTS.