Feature Request: ST_AsEWKB() and ST_AsEWKT()
Summary
The DuckDB Spatial extension currently supports ST_AsText() (WKT) and ST_AsWKB() (WKB), but lacks support for Extended WKT (EWKT) and Extended WKB (EWKB) — formats which preserve essential metadata like SRID and are required for full standards-compliant geometry exchange with spatial databases like PostGIS, SpatiaLite, and others.
This feature request proposes implementing native support for:
- ST_AsEWKT(geometry) → returns a string with full EWKT (including SRID)
- ST_AsEWKB(geometry) → returns a BLOB/BYTEA with full EWKB (including endian flag and SRID)
Motivation
DuckDB is increasingly used as an ETL engine, and the DuckDB Spatial extension makes it attractive for loading and transforming geospatial data. However, there is currently no safe or standards-compliant way to transfer geometries — especially those with an SRID — into systems like PostGIS without fragile workarounds.
Current limitations:
- ST_AsText() drops SRID and produces WKT only — incompatible with EWKT-consuming systems.
- ST_AsWKB() similarly drops SRID, producing WKB that can be misinterpreted by consumers expecting EWKB.
- Direct inserts into PostGIS via ATTACH postgres can fail with endian errors (Invalid endian flag value encountered) due to platform-specific binary serialization.
This leads to fragile and verbose pipelines where the user must:
- Manually append SRID as a string (for WKT)
- Create staging tables with BYTEA columns
- Reapply SRID using ST_SetSRID(ST_GeomFromWKB(...)) inside PostGIS
All of this could be avoided by simply producing EWKT or EWKB, which are:
- Explicitly designed for inter-system geometry exchange
- Understood by PostGIS, SpatiaLite, Oracle Spatial, GeoServer, GDAL/OGR, and more
- Cross-platform safe, due to explicit byte-ordering and metadata inclusion
Proposal
Implement two additional spatial functions:
ST_AsEWKT(geometry)
→ Returns EWKT string, e.g., 'SRID=4326;POINT(1 1)'
ST_AsEWKB(geometry)
→ Returns EWKB binary with endian flag, SRID, geometry type and coordinates
Both functions would match the semantics of PostGIS’s equivalents and follow OGC standards. Benefits
- Native, robust ETL support for PostGIS and other geospatial databases
- Avoids platform-specific endian issues with binary geometry transfer
- Enables safe use of ATTACH postgres for direct geometry inserts
- Eliminates need for staging logic, manual SRID reapplication, or text parsing hacks
- Brings DuckDB Spatial closer to full interoperability with modern GIS tooling
Example Use Case
-- In DuckDB
INSERT INTO postgres_conn.public.spatial_table
SELECT ST_AsEWKB(geom), name, category
FROM ST_Read('data/source_layer.gdb');
-- In PostGIS (geom is a geometry column) -- PostGIS natively parses the EWKB and applies SRID, no manual parsing needed
Conclusion
Support for ST_AsEWKT() and ST_AsEWKB() would be a simple but powerful addition that enables DuckDB to be a truly standards-compliant spatial ETL engine. It would eliminate common pitfalls and open the door to seamless interoperation with a wide variety of geospatial systems — not just PostGIS.
Point me to where I can help making this happen. My C++ skills back then from college are unfortunately quite poor, but I am eager to learn and help whereever possible
Hello!
The reason this isn't implemented is that DuckDB doesn't store SRID's for individual geometry values anyway (and probably never will). Although I guess you could make a ST_AsEWKB(geometry, int) function where the user passes the SRID explicitly. But since DuckDB doesn't have a SPATIAL_REF_SYS table where the SRID definition can be looked up so the usage here is a bit limited.
@Maxxen Well, I get that. Nevertheless I would assume that DuckDB wouldn't need an explicit SPATIAL_REF_SYS table as this isn't used in everydays life of a spatial data engineer anyway. When talking about SRID's we 99% mean CRS in the form of EPSG Codes you can look up elsewhere if you need to. And like in ST_Transform DuckDB already makes use of these to perform a quite heavy operation backed by PROJ.4
What I am proposing here is just a way of a canonical encoding of geometries together with their core metadata (SRID in this case) to make it easy to transfer it. For WKT you can workaround this as shown, but for WKB there is no safe way to ensure the endian flag on every platform which leads to nasty pipeline failures in dockered environments.
So yes, a ST_AsEWKB(geometry, int)would already be great
That makes sense! Both the WKT and WKB writing code in spatial is a bit of a mess right now as it hasn't been rewritten to use the new way we do things, but it should be relatively easy to add support attaching SRIDs/outputting EWKB once I get to work on them again.
The reason this isn't implemented is that DuckDB doesn't store SRID's for individual geometry values anyway (and probably never will).
I would ask for considering storing an authority (AUTH_NAME in Simple Features Access) and code (AUTH_SRID) in the geometry. Only supporting EPSG as an authority would be OK. You don't want to use old-style PROJ strings going forward anyway - at least WKT2 (not to be confused with WKT encoding of a geometry), and the OGC will eventually make a CRSJSON thing.
Although I guess you could make a
ST_AsEWKB(geometry, int)function where the user passes the SRID explicitly.
I think tracking it through operations would be friendlier. That also allows sanity checking when doing aggregate operations, which is harder to do when the CRS is in another column.
But since DuckDB doesn't have a
SPATIAL_REF_SYStable where the SRID definition can be looked up so the usage here is a bit limited.
For most users, SRID of 4326 is EPSG:4326. The indirection through spatial_ref_sys isn't that important. For EWKT and EWKB, there is no authority, so if its an interchange, then that is about the only sane assumption you can make.
@bradh The plan is to eventually parameterize the geometry type itself so that projection info is stored at the column-level and not row level, most likely by allowing users to store an arbitrary string - if thats PROJJSON, AUTH:CODE or CODE is up to the user. You could then verify at bind-time that you don't mix-up two different projections. But we need to do more work in DuckDB core first. I teased it a while back on twitter though:
@Maxxen Although most geoformats do in fact store projection-information row-wise so you are technically able to mix different srs in one table, I never ever have seen any example where somebody actually does this. Also client applications need a single SRS info per "Layer" (=table), respectively the geometry column. Therefore I like this draft very much and think thats a quite smart solution to store metadata about the geometry (srs, geometrytype, dimension, etc...) as parameters of the geometry column.