duckdb_spatial
duckdb_spatial copied to clipboard
Feature: Native WKT reader/writer
We currently rely on GEOS for writing and reading WKT, which requires an unnecessary copy and allocation step to convert back and forth between DuckDB and GEOS geometries.
We should implement our own WKT parser/writer that can use arena allocation.
Since we also only support of a subset of WKT (no extended WKT) this would also give us more control and ensure we don't accidentally import geometries we can't support - basically throw at the parsing step.
Does the above mean there is a way to convert WKT strings into a GEOMETRY or WKB_BLOB type already?
D DESCRIBE FROM read_ndjson_auto('shapes_all_low.jsonl') LIMIT 1;
┌─────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name │ column_type │ null │ key │ default │ extra │
│ varchar │ varchar │ varchar │ varchar │ varchar │ varchar │
├─────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ geoname_id │ UBIGINT │ YES │ │ │ │
│ wkt │ VARCHAR │ YES │ │ │ │
└─────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘
D select wkt::GEOMETRY FROM read_ndjson_auto('shapes_all_low.jsonl') LIMIT 1;
libc++abi: terminating with uncaught exception of type duckdb::NotImplementedException: Not implemented Error: Deserialize::Geometry type not implemented yet!
Abort trap: 6
D select st_astext(wkt) FROM read_ndjson_auto('shapes_all_low.jsonl') LIMIT 1;
Error: Binder Error: No function matches the given name and argument types 'ST_AsText(VARCHAR)'. You might need to add explicit type casts.
Candidate functions:
ST_AsText(POINT_2D) -> VARCHAR
ST_AsText(LINESTRING_2D) -> VARCHAR
ST_AsText(POLYGON_2D) -> VARCHAR
ST_AsText(GEOMETRY) -> VARCHAR
My dataset is a mix of geometry types.
D select SPLIT_PART(wkt, ' ', 1), COUNT(*) from read_ndjson_auto('shapes_all_low.jsonl') GROUP BY 1 ORDER BY 2 DESC;
┌─────────────────────────┬──────────────┐
│ split_part(wkt, ' ', 1) │ count_star() │
│ varchar │ int64 │
├─────────────────────────┼──────────────┤
│ MULTIPOLYGON │ 196 │
│ POLYGON │ 53 │
└─────────────────────────┴──────────────┘
Yes! "WKT" and "Text" is the same thing. Use ST_GeomFromText(VARCHAR). You can pass an 'ignore_invalid := bool flag too to get nulls instead of error should the geom be invalid.
Okay, thanks.