sqlite-tg icon indicating copy to clipboard operation
sqlite-tg copied to clipboard

GeoPackage support

Open rcoup opened this issue 2 years ago • 4 comments

Here's a really rough cut at GeoPackage support

Why? GeoPackage is missing a lightweight implementation IMO: libgpkg isn't maintained, Spatialite is really complex and includes all sorts of other stuff; GDAL's implementation is internal.

What does this do right now?

  • detects and parses GeoPackage geometries to TG geometries
  • implements functions as required by the spec for updating spatial indexes: ST_IsEmpty, ST_MinX, ST_MaxX, ST_MinY, ST_MaxY
  • outputs geopackage geometries via tg_to_gpkg()
  • check whether a DB is a geopackage via tg_is_geopackage_db()

Demo

$ sqlite3 sample1_2.gpkg
sqlite> .load ./dist/tg0
sqlite> PRAGMA trusted_schema=1;

sqlite> SELECT tg_is_geopackage_db();
tg_is_geopackage_db()
---------------------
1

sqlite> SELECT objectid, name, tg_to_geojson(shape) FROM counties WHERE objectid=1;
OBJECTID  NAME               tg_to_geojson(shape)
--------  -----------------  ------------------------------------------------------------
1         Lake of the Woods  {"type":"MultiPolygon","coordinates":[[[[-95.34283129299996,
                             48.54667929400006],[-95.21983979799995,48.54435776100007],[-
                             95.21178801599996,48.36900473700007],[-94.43169004799995,48.
                             36821245400006],[-94.43063447299994,48.71078528100003],[-94.
                             57031273699994,48.71367626400007],[-94.69443202199994,48.777
                             61553000005],[-94.68124994599998,48.877161311000066],[-94.83
                             203926099998,49.33080591700008],[-95.15186733499996,49.37173
                             0144000026],[-95.15774988899994,48.999995915000056],[-95.276
                             65711499998,48.999991212000054],[-95.31012057999999,48.99339
                             545600003],[-95.32323587299999,48.97895633500008],[-95.32091
                             643199999,48.96097701200006],[-95.30375729499997,48.94593892
                             000006],[-95.31417174699999,48.93207200400008],[-95.29026019
                             099996,48.902949584000055],[-95.21957849499995,48.8794465170
                             00076],[-95.13382122699994,48.89448477500008],[-95.094910358
                             99998,48.91176241000005],[-95.09435904099996,48.717357531000
                             06],[-95.34105289899998,48.71517198200007],[-95.342831292999
                             96,48.54667929400006]]]]}

sqlite> UPDATE counties SET shape = tg_to_gpkg('{
    "type": "Polygon",
    "coordinates": [[[30.0, 10.0], [40.0, 40.0], [20.0, 40.0], [10.0, 20.0], [30.0, 10.0]]]
}') WHERE objectid=1;

sqlite> SELECT objectid, name, tg_to_geojson(shape) FROM counties WHERE objectid=1;
OBJECTID  NAME               tg_to_geojson(shape)
--------  -----------------  ------------------------------------------------------------
1         Lake of the Woods  {"type":"Polygon","coordinates":[[[30,10],[40,40],[20,40],[1
                             0,20],[30,10]]]}

Where to next?

Obviously you have some more you want to do. With respect to GeoPackage support (if you're interested in adopting it):

  1. I wonder if there's a way to avoid the tg_to_gpkg() method for UPDATEs & INSERTs. One ideais to go into some sort of auto-geopackage mode where the extension operates natively using GPKG geometries rather than TG geometries.
  2. Probably needs some lightweight functions for setting up geopackage tables with indexes/triggers.

rcoup avatar Sep 26 '23 21:09 rcoup

Hey @rcoup , thanks for the PR! I was initially hesitant in adding other formats that tg doesn't support, but I hadn't realized that it would be this easy to support gpkg-style geometries.

I intend to merge this PR in the near future, but there's a few things I want to finish up before getting it in.

A few things:

  1. Do you have any links to documentation/references of the geopackage blob format? I tried looking but got confused pretty quick
  2. I wanted sqlite-tg functions to be prefixed with tg_ to clearly show they come from sqlite-tg, so I'm a bit hesistant about the new ST_* functions. Does geopackage require they call them that? I think we can add a new entrypoint to the extension so make those functions opt-in, something like .loadExtension("tg0", "tg_include_st") or something
  3. Are there any other geopackage references/blog posts I can read to learn more about it?

asg017 avatar Sep 30 '23 19:09 asg017

I intend to merge this PR in the near future, but there's a few things I want to finish up before getting it in.

I think it needs a bit more work. Certainly needs tests :-)

  1. Do you have any links to documentation/references of the geopackage blob format?

Yes. http://www.geopackage.org/spec/#gpb_format is the specification for the binary geometry format. Essentially it's standard OGC WKB with a header that includes a SRID, an optional bounding box, and an is-empty flag.

  1. I wanted sqlite-tg functions to be prefixed with tg_ to clearly show they come from sqlite-tg, so I'm a bit hesistant about the new ST_* functions. Does geopackage require they call them that?

So GeoPackage uses SQLite's RTree indexes to build spatial indexes populated with geometry bounding boxes. To update them, they set up triggers which writes to the indexes when you update the table rows. And those trigger definitions call some ST_* functions to get the envelopes, which are part of the ISO SQL/MM Simple Features for SQL specifications that PostGIS/etc implement. Because those names (and their behaviours) are defined by the standards, so ST_Intersects() should mean the same thing in PostGIS, Oracle Spatial, Spatialite, etc.

I think we can add a new entrypoint to the extension so make those functions opt-in, something like .loadExtension("tg0", "tg_include_st") or something

Personally if I have a GeoPackage I just want it to work with the minimum of fuss. Same names, same definitions, etc :-) But yes, loading a different extension would be fine. tg_geopackage? That way if you loaded tg_geopackage you could just always use GPKG binary geometries rather than TG ones, which would probably make things simpler all round.

PostGIS uses ST_* for everything including postgis-only functions, I think other databases do too. Personally I'd be fine with "tg_* means this is a TG function and "st_* means this is a standard function".

Are there any other geopackage references/blog posts I can read to learn more about it?

geopackage.org has a fair number of links. And switchfromshapefile.org :wink: Happy to dig up reading on specifics.

Happy to answer more questions,

rcoup avatar Oct 02 '23 10:10 rcoup

/cc @simonw @ianturton @bdon any ideas/comments?

rcoup avatar Oct 02 '23 10:10 rcoup

FYI, functions need to be tagged SQLITE_INNOCUOUS — some distributions of SQLite are now compiling with trusted_schema=off, and users will need to do PRAGMA trusted_schema=1; to enable non-innocuous function calls.

https://github.com/OSGeo/gdal/issues/8514

rcoup avatar Oct 04 '23 10:10 rcoup