[GH-1996] Implement S2Geography support on Apache Sedona
PR links
- Point, PolyLine, Polygon implementation https://github.com/apache/sedona/pull/1992
(This version of the implementation mirrors the original C++ code from https://github.com/paleolimbot/s2geography; future optimization may be required.)
The S2Geography framework provides a unified API for handling spherical geometries (points, polylines, polygons and multipolygons) by defining an abstract base class, S2Geography, and six concrete subclasses:
- PointGeography
- PolylineGeography
- PolygonGeography
- GeographyCollection
- ShapeIndexGeography
- EncodedShapeIndexGeography
Each subclass implements a consistent set of core operations:
- dimension()
- numShapes()
- shape(int index)
- region()
- encoding and decoding via its dedicated encoder/decoder
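The shared operations can be sketched in Java roughly as follows. This is a minimal illustration, not the code from the PR: ShapeSketch, GeographySketch, PointGeographySketch and CollectionSketch are hypothetical stand-ins that avoid any dependency on the S2 library, the dimension() rule (-1 for an empty or mixed-dimension geography) is an assumption, and region() and the encoder/decoder are left out because they depend on S2 library types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the S2 library's shape type.
interface ShapeSketch {
    int dimension(); // 0 = points, 1 = polylines, 2 = polygons
}

// Illustrative base class; region() and encode/decode are omitted because
// they depend on S2 library types.
abstract class GeographySketch {
    abstract int numShapes();

    abstract ShapeSketch shape(int index);

    // Assumed default: the common dimension of all shapes, or -1 if the
    // geography is empty or mixes dimensions.
    int dimension() {
        if (numShapes() == 0) {
            return -1;
        }
        int dim = shape(0).dimension();
        for (int i = 1; i < numShapes(); i++) {
            if (shape(i).dimension() != dim) {
                return -1;
            }
        }
        return dim;
    }
}

// A set of points exposed as a single zero-dimensional shape.
class PointGeographySketch extends GeographySketch {
    private final List<double[]> latLngDegrees = new ArrayList<>();

    void addPoint(double latDeg, double lngDeg) {
        latLngDegrees.add(new double[] {latDeg, lngDeg});
    }

    @Override
    int numShapes() {
        return latLngDegrees.isEmpty() ? 0 : 1;
    }

    @Override
    ShapeSketch shape(int index) {
        if (index < 0 || index >= numShapes()) {
            throw new IndexOutOfBoundsException("shape index " + index);
        }
        return () -> 0;
    }
}

// A collection that forwards shape lookups to its children in order.
class CollectionSketch extends GeographySketch {
    private final List<GeographySketch> children = new ArrayList<>();

    void add(GeographySketch child) {
        children.add(child);
    }

    @Override
    int numShapes() {
        int total = 0;
        for (GeographySketch child : children) {
            total += child.numShapes();
        }
        return total;
    }

    @Override
    ShapeSketch shape(int index) {
        int remaining = index;
        for (GeographySketch child : children) {
            if (remaining < child.numShapes()) {
                return child.shape(remaining);
            }
            remaining -= child.numShapes();
        }
        throw new IndexOutOfBoundsException("shape index " + index);
    }
}
```

Keeping dimension() in the base class means each subclass only has to answer numShapes() and shape(int); a collection that mixes points with polylines would then report -1 under the assumed rule.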
On top of these primitives, S2Geography supports:
- Construction of any geography type
- Reading from and writing to WKT and WKB formats
- Indexing for all S2Geography objects
- Projections between coordinate systems
- Spatial operations such as: 1) distance computation, 2) equality testing, 3) intersection checks, 4) containment tests, and 5) within-predicate evaluations
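To make "distance computation" on a sphere concrete, here is a small Java sketch of a great-circle distance between two latitude/longitude points. It uses the haversine formula as a stand-in; it is not how S2 or this port computes distances (s2geography delegates to S2 queries such as S2ClosestEdgeQuery). The class name SphereDistanceSketch and the mean Earth radius constant are assumptions made for illustration.

```java
// Hypothetical helper; not part of Sedona or the S2 library.
final class SphereDistanceSketch {
    // Assumed mean Earth radius in meters.
    static final double EARTH_RADIUS_METERS = 6_371_008.8;

    private SphereDistanceSketch() {}

    // Central angle between two points, in radians, via the haversine formula.
    static double angleRadians(double lat1Deg, double lng1Deg,
                               double lat2Deg, double lng2Deg) {
        double lat1 = Math.toRadians(lat1Deg);
        double lat2 = Math.toRadians(lat2Deg);
        double dLat = lat2 - lat1;
        double dLng = Math.toRadians(lng2Deg - lng1Deg);
        double sinLat = Math.sin(dLat / 2);
        double sinLng = Math.sin(dLng / 2);
        double a = sinLat * sinLat
                + Math.cos(lat1) * Math.cos(lat2) * sinLng * sinLng;
        // Clamp against rounding error before taking square roots.
        a = Math.min(1.0, Math.max(0.0, a));
        return 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // Great-circle distance in meters on a sphere of the assumed radius.
    static double distanceMeters(double lat1Deg, double lng1Deg,
                                 double lat2Deg, double lng2Deg) {
        return EARTH_RADIUS_METERS
                * angleRadians(lat1Deg, lng1Deg, lat2Deg, lng2Deg);
    }
}
```

Unlike a planar distance, this measures along the surface of the sphere, which is the kind of result a geography type is expected to return.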
@ZhuochengShang thanks. Can you add a bit more description of this project here? Like what do we want to implement in the S2geography project?
I reviewed the s2geography codebase we are porting, along with the related PR. There are several fundamental issues with the current approach to supporting geography queries:
- s2geography implements geography predicates (intersects, containment checks, etc.) and distance computation by constructing S2ShapeIndex for various kinds of geography objects, and invokes S2BooleanOperation or S2ClosestEdgeQuery to do the heavy lifting. These algorithms were not ported to the latest com.google.geometry:s2-geometry:2.0.0 release.
- The master branch of google/s2-geometry-library-java has these algorithms ported from C++ to Java, but there's no official release yet (https://github.com/google/s2-geometry-library-java/issues/40).
Porting the S2 algorithms we need, such as S2BooleanOperation, into our S2Geography implementation seems like redundant effort; the ideal approach is to depend on a new release of the official s2-geometry Java library. There might be breaking changes in the master branch, so this may affect other, more fundamental work, including the basic Geography type implementations, WKT/WKB support, encoding/decoding, etc.
CC @paleolimbot
Update: we decided to release a fork of the latest s2-geometry-library-java to Maven Central under the org.datasyslab group; this will unblock our implementation of the S2Geography module. We also need to shade this library into the sedona-common module to avoid conflicts with other S2 geometry jars at runtime.
Thanks for the ping!
I haven't followed the developments in Java, but in C++ the release cadence is very slow. S2BooleanOperation is pretty critical to the functions that people actually care about and is complex/difficult to implement otherwise (depending on what else was already in Java). This approach seems great!