st_write support GeoParquet

Open arthurgailes opened this issue 10 months ago • 14 comments

As of 1.0-19, writing (geo)parquet files doesn't seem to work. It's supported in GDAL, but not included in st_drivers(). This issue is mentioned in an old thread https://github.com/r-spatial/sf/issues/1756, but I'm not sure what the status is.

st_read(system.file("shape/nc.shp", package="sf")) |> 
  st_write("test.parquet", driver = "GeoParquet")

driver `GeoParquet' not available.
Error: Driver not available.

Thanks for your time.

arthurgailes avatar Feb 13 '25 13:02 arthurgailes

Please report your operating system and how you installed sf. Not all GDAL installations are built with support for the same sets of drivers; this is not under the control of sf.
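
For anyone answering this request, the relevant details can be gathered in one go. This is a minimal sketch, assuming only that sf is installed; the `info` and `wanted` names are illustrative, and the output will differ by machine.

```r
library(sf)

# Platform and version details that determine which GDAL drivers are present
info <- list(
  os   = utils::sessionInfo()$running,             # operating system description
  sf   = as.character(utils::packageVersion("sf")), # installed sf version
  gdal = unname(sf_extSoftVersion()["GDAL"])        # GDAL version sf is linked against
)
print(info)

# Check whether this GDAL build lists the drivers discussed in this thread
drivers <- st_drivers(what = "vector")
wanted <- c("Parquet", "Arrow")
print(setNames(wanted %in% drivers$name, wanted))
```

How sf was installed (CRAN binary, built from source, distro package) still has to be stated by hand, since it cannot be read reliably from the session.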

rsbivand avatar Feb 13 '25 13:02 rsbivand

It seems the driver name is Parquet, not GeoParquet. (edit: this result is on my Manjaro Linux, which installed GDAL and Arrow from the distro's official repository, and built sf from source.)

sf::st_drivers() |>
  dplyr::filter(name == "Parquet")
#>            name    long_name write  copy is_raster is_vector  vsi
#> Parquet Parquet (Geo)Parquet  TRUE FALSE     FALSE      TRUE TRUE
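
Building on that output, here is a minimal sketch of guarding the write on driver availability. `has_write_driver()` is a hypothetical helper, not part of sf, and the sketch writes to a temporary file rather than `test.parquet`.

```r
library(sf)

# Hypothetical helper: TRUE if the local GDAL build lists `driver`
# as a vector driver with write support
has_write_driver <- function(driver) {
  drv <- st_drivers(what = "vector")
  driver %in% drv$name[drv$write]
}

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
out <- tempfile(fileext = ".parquet")

if (has_write_driver("Parquet")) {
  # The driver is named "Parquet", not "GeoParquet"
  st_write(nc, out, driver = "Parquet")
} else {
  message("This GDAL build has no writable Parquet driver.")
}
```

On builds without the driver, the `else` branch runs instead of the `Driver not available` error.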

yutannihilation avatar Feb 13 '25 14:02 yutannihilation

Just in case this is useful, I found no Parquet driver on my Windows machine. It seems the GDAL bundled with Rtools (v3.10.1 at the latest) is built without GeoParquet support.

yutannihilation avatar Feb 13 '25 14:02 yutannihilation

@yutannihilation the OS and how sf and GDAL were installed also matter apart from spelling.

rsbivand avatar Feb 13 '25 14:02 rsbivand

Sorry, I added this information to my first comment! It was the result on my Manjaro Linux laptop.

yutannihilation avatar Feb 13 '25 14:02 yutannihilation

@yutannihilation Thanks for your help! As far as I know, the CRAN Windows and macOS static binary builds omit the GDAL vector "Parquet" and "Arrow" drivers. Different Linux package management systems may choose to include or exclude the upstream software needed to include these drivers.

rsbivand avatar Feb 13 '25 15:02 rsbivand

Oh, good to know the situation about Windows and macOS. Thanks for sharing the details!

yutannihilation avatar Feb 13 '25 15:02 yutannihilation

> Please report your operating system and how you installed sf. Not all GDAL installations are built with support for the same sets of drivers; this is not under the control of sf.

The problem is present on both Windows 10 and Windows Server 2016. I installed sf via install.packages().

arthurgailes avatar Feb 13 '25 16:02 arthurgailes

@arthurgailes The current sf Windows static binary package (1.0-19), built with current Rtools44 (6414), does not include the Parquet or Arrow drivers. There are traces in the sf source of attempts to use nanoarrow to read through a stream interface, but I think that this is not user-ready (not tried).

rsbivand avatar Feb 14 '25 09:02 rsbivand

Here is a similar thread in terra with responses from Dewey and Tomas: https://github.com/rspatial/terra/issues/1347

kadyb avatar Feb 14 '25 12:02 kadyb

@kadyb thanks, very useful!

rsbivand avatar Feb 14 '25 13:02 rsbivand

With my Ubuntu 24.04 and the ubuntugis-unstable PPA I couldn't read GeoParquet files, but in a Docker image with this Dockerfile I could.

edzer avatar Feb 14 '25 17:02 edzer

This seems to work quickly following the example at https://github.com/geoarrow/geoarrow-r if googlers are looking for a workaround
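
For reference, a rough sketch of that workaround, modeled on the geoarrow-r README as best I recall it. It assumes the geoarrow, arrow, tibble, and sf packages are installed, and goes through arrow instead of GDAL. Whether the resulting file carries full GeoParquet metadata may depend on package versions; I have not verified that.

```r
library(geoarrow)
library(arrow, warn.conflicts = FALSE)
library(sf)

nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
tf <- tempfile(fileext = ".parquet")

# Drop the sf class so arrow sees a plain data frame; with geoarrow loaded,
# the geometry column is converted to a geoarrow type on write
nc |>
  tibble::as_tibble() |>
  write_parquet(tf)

# Read the file back and convert to an sf object
back <- open_dataset(tf) |>
  st_as_sf()
```

Because this does not use GDAL at all, it should not depend on which drivers `st_drivers()` lists.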

kendonB avatar Mar 05 '25 19:03 kendonB