duckdb_spatial

can't open a geojson.gz file

Open ericemc3 opened this issue 5 months ago • 5 comments

FROM st_read('https://object.data.gouv.fr/contours-administratifs/2024/geojson/departements-100m.geojson.gz') ;

=> SQL Error: Not implemented Error: GZipFileSystem: GetLastModifiedTime is not implemented!

ericemc3 avatar Aug 01 '25 10:08 ericemc3

Another runnable example where I could reproduce this is here:

from st_read('https://github.com/hrbrmstr/albersusa/raw/refs/heads/master/inst/extdata/composite_us_counties.geojson.gz');

If I download the file, I'm able to read it locally with this approach:

from st_read('/vsigzip/composite_us_counties.geojson.gz');

I also have similar issues trying to access it from S3.
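For anyone who wants to script the download-then-read-locally workaround, here is a minimal Python sketch using only the standard library. It goes one step further than the query above and decompresses the file, so the result is a plain GeoJSON file that needs no '/vsigzip/' prefix. The function name `fetch_and_gunzip` and the destination filename are my own placeholders, not part of DuckDB or the spatial extension, and I'm assuming the file fits comfortably on local disk.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def fetch_and_gunzip(url: str, dest: str) -> str:
    """Download a gzip-compressed file and write the decompressed bytes to dest.

    Hypothetical helper for illustration; returns the absolute path of dest.
    """
    with urllib.request.urlopen(url) as resp:
        # Stream-decompress the response instead of loading it all into memory.
        with gzip.GzipFile(fileobj=resp) as gz, open(dest, "wb") as out:
            shutil.copyfileobj(gz, out)
    return str(Path(dest).resolve())


if __name__ == "__main__":
    local_path = fetch_and_gunzip(
        "https://github.com/hrbrmstr/albersusa/raw/refs/heads/master/inst/extdata/composite_us_counties.geojson.gz",
        "composite_us_counties.geojson",
    )
    print(local_path)
```

The decompressed file can then be passed to st_read as an ordinary local path. This sidesteps the GZipFileSystem error rather than fixing it.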

Alex-Monahan avatar Aug 13 '25 00:08 Alex-Monahan

Hello! This is possible using GDAL's virtual filesystem. E.g.

FROM st_read('/vsigzip//vsicurl/https://object.data.gouv.fr/contours-administratifs/2024/geojson/departements-100m.geojson.gz') ;


┌─────────┬──────────────────────┬─────────┬────────────────────────────────────────────────────────────────────────────────────────────────┐
│  code   │         nom          │ region  │                                              geom                                              │
│ varchar │       varchar        │ varchar │                                            geometry                                            │
├─────────┼──────────────────────┼─────────┼────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 01      │ Ain                  │ 84      │ POLYGON ((5.8251 45.9386, 5.8292 45.9383, 5.8263 45.9361, 5.8227 45.9307, 5.8226 45.927, 5.8…  │
│ 02      │ Aisne                │ 32      │ POLYGON ((3.9867 49.379, 3.9772 49.3795, 3.9725 49.3794, 3.9615 49.3773, 3.9556 49.3819, 3.9…  │
│ 03      │ Allier               │ 84      │ POLYGON ((2.6778 46.7045, 2.6759 46.7068, 2.6714 46.7088, 2.6794 46.7128, 2.6835 46.7133, 2.…  │
│ 04      │ Alpes-de-Haute-Pro…  │ 93      │ POLYGON ((6.7283 44.2526, 6.724 44.2499, 6.7257 44.2426, 6.7239 44.2406, 6.7235 44.2373, 6.7…  │
│ 05      │ Hautes-Alpes         │ 93      │ POLYGON ((6.8834 44.8481, 6.8883 44.8473, 6.892 44.8482, 6.8953 44.8478, 6.8981 44.8485, 6.9…  │
│ 06      │ Alpes-Maritimes      │ 93      │ MULTIPOLYGON (((6.8323 43.9185, 6.8331 43.9156, 6.8357 43.9157, 6.8416 43.9163, 6.8473 43.91…  │
│ 07      │ Ardèche              │ 84      │ POLYGON ((4.8091 45.2589, 4.8075 45.2556, 4.8026 45.2486, 4.8015 45.2423, 4.8028 45.2278, 4.…  │
│ 08      │ Ardennes             │ 44      │ POLYGON ((5.0111 49.2694, 4.9976 49.2642, 5 49.2607, 4.9994 49.2594, 4.9935 49.2571, 4.9897 …  │
│ 09      │ Ariège               │ 76      │ POLYGON ((0.8599 42.8383, 0.8569 42.8397, 0.857 42.8431, 0.8529 42.8498, 0.853 42.8515, 0.85…  │
│ 10      │ Aube                 │ 44      │ POLYGON ((3.7455 48.1675, 3.7382 48.1703, 3.7299 48.1718, 3.7185 48.1756, 3.7138 48.174, 3.7…  │
│ 11      │ Aude                 │ 76      │ POLYGON ((2.8948 43.3262, 2.9007 43.3247, 2.904 43.3229, 2.9065 43.3227, 2.9104 43.3201, 2.9…  │
│ 12      │ Aveyron              │ 76      │ POLYGON ((1.9165 44.4863, 1.9192 44.4879, 1.9211 44.4907, 1.9206 44.492, 1.9124 44.4989, 1.9…  │
│ 13      │ Bouches-du-Rhône     │ 93      │ MULTIPOLYGON (((4.8416 43.3337, 4.835 43.3287, 4.8325 43.3307, 4.8228 43.3362, 4.8139 43.339…  │
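Since the prefix chaining is easy to forget, the path construction used above can be wrapped in a small helper. This is only a sketch in Python: `gdal_vsi_path` is a hypothetical name, it handles just the two cases shown in this thread (HTTP(S) URLs and a `.gz` suffix), and it assumes the URL has no query string after the `.gz`. Whether st_read accepts the resulting path still depends on the GDAL build bundled with the spatial extension.

```python
def gdal_vsi_path(source: str) -> str:
    """Build a GDAL virtual filesystem path for a local path or HTTP(S) URL.

    Hypothetical helper: prefixes /vsicurl/ for remote files and
    /vsigzip/ for gzip-compressed files, chaining both when needed.
    """
    path = source
    if source.startswith(("http://", "https://")):
        # Remote file: let GDAL fetch it over HTTP(S).
        path = "/vsicurl/" + path
    if source.lower().endswith(".gz"):
        # Compressed file: wrap whatever path we have so far in /vsigzip/.
        path = "/vsigzip/" + path
    return path


if __name__ == "__main__":
    url = (
        "https://object.data.gouv.fr/contours-administratifs/"
        "2024/geojson/departements-100m.geojson.gz"
    )
    # Paste the printed path into st_read('...').
    print(gdal_vsi_path(url))
```

For the URL above this produces the same '/vsigzip//vsicurl/https://...' string as the query, and for a relative local file such as 'composite_us_counties.geojson.gz' it produces the '/vsigzip/...' form used earlier in the thread.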




Maxxen avatar Aug 18 '25 11:08 Maxxen

Oh, thank you! Actually, I know the trick, but the lay people I work with don’t — and they won’t remember it ;) DuckDB’s promise is that each file should be straightforward to access, isn’t it?

ericemc3 avatar Aug 18 '25 19:08 ericemc3

Sure! Ideally we won't need the vsi stuff soon, now that we have our own HTTP cache. Will investigate further!

Maxxen avatar Aug 18 '25 19:08 Maxxen

On 1.4.2 I am not able to get any '/vsicurl/' path to work; it simply says the file doesn't exist in the file system. The same path works with 'ogrinfo' on the command line. Even a plain (uncompressed) shapefile doesn't work. I am at a loss.

CREATE TABLE gdal_http_test_1 AS
     FROM ST_Read('/vsigzip//vsicurl/https://object.data.gouv.fr/contours-administratifs/2024/geojson/departements-100m.geojson.gz');
IO Error:
GDAL Error (4): `/vsigzip//vsicurl/https://object.data.gouv.fr/contours-administratifs/2024/geojson/departements-100m.geojson.gz' does not exist in the file system, and is not recognized as a supported dataset name.

A simpler example (from the GDAL vsi docs) with no gzip:

CREATE TABLE vsicurl_test_1 AS
    FROM ST_Read('/vsicurl/https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.shp');
IO Error:
GDAL Error (4): `/vsicurl/https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.shp' does not exist in the file system, and is not recognized as a supported dataset name.

Accessing and loading Shapefiles over HTTPS in general (without using the GDAL virtual filesystem paths) works:

CREATE OR REPLACE TABLE non_vsicurl_test_1 AS
    FROM ST_Read('https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.shp');

or without using GDAL at all:

CREATE OR REPLACE TABLE non_vsicurl_test_1 AS
    FROM ST_ReadSHP('https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.shp');

justin0mcateer avatar Nov 15 '25 00:11 justin0mcateer