st_make_valid() shifts coordinates
Describe the bug By some reason st_make_valid() shifts coordinates of the original polygons. I also attached a zip with a polygon layer for a demonstration
To Reproduce library(sf) vec <- st_read("Desktop/test/vec.gpkg") plot(vec$geom, xlim = c(39.089111, 39.089113), ylim = c(46.763430, 46.763434)) vec2 <- st_make_valid(vec) plot(vec2$geom, add = T, border = "red")
Which gives:
Matrix products: default BLAS/LAPACK: /usr/lib/libopenblas.so.0.3; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=ru_RU.UTF-8 LC_NUMERIC=C LC_TIME=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8
[5] LC_MONETARY=ru_RU.UTF-8 LC_MESSAGES=ru_RU.UTF-8 LC_PAPER=ru_RU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Moscow tzcode source: system (glibc)
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] sf_1.0-14 magrittr_2.0.3
loaded via a namespace (and not attached):
[1] s2_1.1.4 utf8_1.2.3 R6_2.5.1 tidyselect_1.2.0 e1071_1.7-13 glue_1.6.2 tibble_3.2.1
[8] KernSmooth_2.23-21 pkgconfig_2.0.3 generics_0.1.3 dplyr_1.1.2 wk_0.7.3 lifecycle_1.0.3 classInt_0.4-9
[15] cli_3.6.1 fansi_1.0.4 grid_4.3.1 vctrs_0.6.3 DBI_1.1.3 textshaping_0.3.6 proxy_0.4-27
[22] systemfonts_1.0.4 class_7.3-22 compiler_4.3.1 rstudioapi_0.15.0 tools_4.3.1 ragg_1.2.5 pillar_1.9.0
[29] Rcpp_1.0.11 rlang_1.1.1 units_0.8-2
sf::sf_extSoftVersion()
GEOS GDAL proj.4 GDAL_with_GEOS USE_PROJ_H PROJ
"3.11.2" "3.7.0" "9.2.1" "true" "true" "9.2.1"
Please attach the file you used.
It seems that this only applies to s2 engine?
vec <- st_read("vec.gpkg")
vec <- st_transform(vec, crs = 3857) # convert to planar CRS
vec2 <- st_make_valid(vec)
identical(st_coordinates(vec)[, 1:2], st_coordinates(vec2)[, 1:2])
#> [1] TRUE
Yes, after setting sf_use_s2(FALSE) you also get identical results.
Digging into this I got the feeling that something happens inside wk:
library(sf)
# Linking to GEOS 3.11.1, GDAL 3.6.4, PROJ 9.1.1; sf_use_s2() is TRUE
vec <- st_read("vec.gpkg")
# Reading layer `vec' from data source `/home/edzer/Downloads/test/vec.gpkg' using driver `GPKG'
# Simple feature collection with 2 features and 0 fields
# Geometry type: MULTIPOLYGON
# Dimension: XY
# Bounding box: xmin: 39.07532 ymin: 46.75672 xmax: 39.09903 ymax: 46.76667
# Geodetic CRS: WGS 84
geom = st_geometry(vec)
geom
# Geometry set for 2 features
# Geometry type: MULTIPOLYGON
# Dimension: XY
# Bounding box: xmin: 39.07532 ymin: 46.75672 xmax: 39.09903 ymax: 46.76667
# Geodetic CRS: WGS 84
# MULTIPOLYGON (((39.09903 46.76268, 39.0974 46.7...
# MULTIPOLYGON (((39.08823 46.75861, 39.08754 46....
geom |> st_as_binary() |> st_as_sfc(crs = st_crs(geom)) -> geom2
identical(geom, geom2)
# [1] TRUE
geom |> st_as_binary() |> s2::as_s2_geography()
# <geodesic s2_geography[2] with CRS=OGC:CRS84>
# [1] POLYGON ((39.0980983 46.7645449, 39.0967768 46.7658228, 39.0918796 46.7666747, 39.089483 46.763648, 39.089112 46.763434...
# [2] POLYGON ((39.0891125 46.7634337, 39.088777 46.7632976, 39.0884805 46.7631988, 39.0882671 46.763073, 39.0880037 46.7629202...
geom |> st_as_binary() |> s2::as_s2_geography() |> st_as_sfc(crs=st_crs(geom))
# Geometry set for 2 features
# Geometry type: POLYGON
# Dimension: XY
# Bounding box: xmin: 39.07532 ymin: 46.75672 xmax: 39.09903 ymax: 46.76667
# Geodetic CRS: WGS 84
# POLYGON ((39.0981 46.76454, 39.09678 46.76582, ...
# POLYGON ((39.08911 46.76343, 39.08878 46.7633, ...
@paleolimbot could you take a look? The problem starts with the first longitude of the first polygon node.
@edzer I think you are observing reordering that occurs because oriented = FALSE by default?
I wonder if the default snap precision (7 decimal places) is causing the move here:
sf:::st_make_valid.sfc
#> function (x, ..., oriented = FALSE, s2_options = s2::s2_options(snap = s2::s2_snap_precision(1e+07),
#> ...), geos_method = "valid_structure", geos_keep_collapsed = TRUE)
#> ...
#> <bytecode: 0x10574ce28>
#> <environment: namespace:sf>
If I missed the mark on both of these, perhaps one of you could modify the following reprex and I'll run it through the debugger? There are some mildly complex things that happen to support tesselating planar edges into spherical ones and it's not impossible that something about these points triggers something I or the s2geometry authors didn't consider.
lng <- c(39.0990310770001, 39.0973986580001, 39.0900920320001, 39.0990310770001)
lat <- c(46.7626812080001, 46.7577288260001, 46.7584910210001, 46.7626812080001)
xy <- wk::xy(lng, lat, crs = wk::wk_crs_longlat())
poly <- wk::wk_polygon(xy)
(sfc <- sf::st_as_sfc(poly))
#> Geometry set for 1 feature
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 39.09009 ymin: 46.75773 xmax: 39.09903 ymax: 46.76268
#> Geodetic CRS: WGS 84
#> POLYGON ((39.09903 46.76268, 39.0974 46.75773, ...
# Using oriented = TRUE to avoid reordering
(geog <- s2::as_s2_geography(sfc, oriented = TRUE))
#> <geodesic s2_geography[1] with CRS=OGC:CRS84>
#> [1] POLYGON ((39.0990311 46.7626812, 39.0973987 46.7577288, 39.090092 46.758491, 39.0990311 46.7626812))
(sfc2 <- sf::st_as_sfc(geog))
#> Geometry set for 1 feature
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 39.09009 ymin: 46.75773 xmax: 39.09903 ymax: 46.76268
#> Geodetic CRS: WGS 84
#> POLYGON ((39.09903 46.76268, 39.0974 46.75773, ...
waldo::compare(
sfc |> sf::st_set_crs(NA) |> wk::wk_vertices() |> wk::as_wkt(),
sfc2 |> sf::st_set_crs(NA) |> wk::wk_vertices() |> wk::as_wkt()
)
#> unclass(old) vs unclass(new)
#> - "POINT (39.0990310770001 46.7626812080001)"
#> + "POINT (39.0990310770001 46.76268120800011)"
#> "POINT (39.0973986580001 46.7577288260001)"
#> "POINT (39.0900920320001 46.7584910210001)"
#> - "POINT (39.0990310770001 46.7626812080001)"
#> + "POINT (39.0990310770001 46.76268120800011)"
Created on 2023-08-15 with reprex v2.0.2
I wonder if the default snap precision (7 decimal places) is causing the move here:
Indeed, increasing the precision helps.
st_make_valid(st_as_sfc(vec), s2_options = s2::s2_options(snap = s2::s2_snap_precision(1e+09)))
makes the lines coincide (but the coordinates are slightly different).
BTW: Setting the precision too high (i.e. 1e+18) and then run st_make_valid() crashes RStudio.