sf icon indicating copy to clipboard operation
sf copied to clipboard

st_make_valid() shifts coordinates

Open erydit opened this issue 2 years ago • 7 comments

Describe the bug By some reason st_make_valid() shifts coordinates of the original polygons. I also attached a zip with a polygon layer for a demonstration

To Reproduce library(sf) vec <- st_read("Desktop/test/vec.gpkg") plot(vec$geom, xlim = c(39.089111, 39.089113), ylim = c(46.763430, 46.763434)) vec2 <- st_make_valid(vec) plot(vec2$geom, add = T, border = "red")

Which gives: Rplot

`sessionInfo()` R version 4.3.1 (2023-06-16) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Manjaro Linux

Matrix products: default BLAS/LAPACK: /usr/lib/libopenblas.so.0.3; LAPACK version 3.11.0

locale: [1] LC_CTYPE=ru_RU.UTF-8 LC_NUMERIC=C LC_TIME=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8
[5] LC_MONETARY=ru_RU.UTF-8 LC_MESSAGES=ru_RU.UTF-8 LC_PAPER=ru_RU.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C

time zone: Europe/Moscow tzcode source: system (glibc)

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] sf_1.0-14 magrittr_2.0.3

loaded via a namespace (and not attached): [1] s2_1.1.4 utf8_1.2.3 R6_2.5.1 tidyselect_1.2.0 e1071_1.7-13 glue_1.6.2 tibble_3.2.1
[8] KernSmooth_2.23-21 pkgconfig_2.0.3 generics_0.1.3 dplyr_1.1.2 wk_0.7.3 lifecycle_1.0.3 classInt_0.4-9
[15] cli_3.6.1 fansi_1.0.4 grid_4.3.1 vctrs_0.6.3 DBI_1.1.3 textshaping_0.3.6 proxy_0.4-27
[22] systemfonts_1.0.4 class_7.3-22 compiler_4.3.1 rstudioapi_0.15.0 tools_4.3.1 ragg_1.2.5 pillar_1.9.0
[29] Rcpp_1.0.11 rlang_1.1.1 units_0.8-2

sf::sf_extSoftVersion()

      GEOS           GDAL         proj.4 GDAL_with_GEOS     USE_PROJ_H           PROJ 
  "3.11.2"        "3.7.0"        "9.2.1"         "true"         "true"        "9.2.1" 
[test.zip](https://github.com/r-spatial/sf/files/12336862/test.zip)

erydit avatar Aug 14 '23 15:08 erydit

Please attach the file you used.

edzer avatar Aug 14 '23 16:08 edzer

Please attach the file you used.

Sorry, I thought it has been attached: test.zip

erydit avatar Aug 14 '23 16:08 erydit

It seems that this only applies to s2 engine?

vec <- st_read("vec.gpkg")
vec <- st_transform(vec, crs = 3857) # convert to planar CRS
vec2 <- st_make_valid(vec)
identical(st_coordinates(vec)[, 1:2], st_coordinates(vec2)[, 1:2])
#> [1] TRUE

kadyb avatar Aug 15 '23 10:08 kadyb

Yes, after setting sf_use_s2(FALSE) you also get identical results.

edzer avatar Aug 15 '23 10:08 edzer

Digging into this I got the feeling that something happens inside wk:

library(sf)
# Linking to GEOS 3.11.1, GDAL 3.6.4, PROJ 9.1.1; sf_use_s2() is TRUE

vec <- st_read("vec.gpkg")
# Reading layer `vec' from data source `/home/edzer/Downloads/test/vec.gpkg' using driver `GPKG'
# Simple feature collection with 2 features and 0 fields
# Geometry type: MULTIPOLYGON
# Dimension:     XY
# Bounding box:  xmin: 39.07532 ymin: 46.75672 xmax: 39.09903 ymax: 46.76667
# Geodetic CRS:  WGS 84

geom = st_geometry(vec)
geom
# Geometry set for 2 features 
# Geometry type: MULTIPOLYGON
# Dimension:     XY
# Bounding box:  xmin: 39.07532 ymin: 46.75672 xmax: 39.09903 ymax: 46.76667
# Geodetic CRS:  WGS 84
# MULTIPOLYGON (((39.09903 46.76268, 39.0974 46.7...
# MULTIPOLYGON (((39.08823 46.75861, 39.08754 46....
geom |> st_as_binary() |> st_as_sfc(crs = st_crs(geom)) -> geom2
identical(geom, geom2)
# [1] TRUE

geom |> st_as_binary() |> s2::as_s2_geography()
# <geodesic s2_geography[2] with CRS=OGC:CRS84>
# [1] POLYGON ((39.0980983 46.7645449, 39.0967768 46.7658228, 39.0918796 46.7666747, 39.089483 46.763648, 39.089112 46.763434...  
# [2] POLYGON ((39.0891125 46.7634337, 39.088777 46.7632976, 39.0884805 46.7631988, 39.0882671 46.763073, 39.0880037 46.7629202...
geom |> st_as_binary() |> s2::as_s2_geography() |> st_as_sfc(crs=st_crs(geom))
# Geometry set for 2 features 
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 39.07532 ymin: 46.75672 xmax: 39.09903 ymax: 46.76667
# Geodetic CRS:  WGS 84
# POLYGON ((39.0981 46.76454, 39.09678 46.76582, ...
# POLYGON ((39.08911 46.76343, 39.08878 46.7633, ...

@paleolimbot could you take a look? The problem starts with the first longitude of the first polygon node.

edzer avatar Aug 15 '23 10:08 edzer

@edzer I think you are observing reordering that occurs because oriented = FALSE by default?

I wonder if the default snap precision (7 decimal places) is causing the move here:

sf:::st_make_valid.sfc
#> function (x, ..., oriented = FALSE, s2_options = s2::s2_options(snap = s2::s2_snap_precision(1e+07), 
#>     ...), geos_method = "valid_structure", geos_keep_collapsed = TRUE) 
#> ...
#> <bytecode: 0x10574ce28>
#> <environment: namespace:sf>

If I missed the mark on both of these, perhaps one of you could modify the following reprex and I'll run it through the debugger? There are some mildly complex things that happen to support tesselating planar edges into spherical ones and it's not impossible that something about these points triggers something I or the s2geometry authors didn't consider.

lng <- c(39.0990310770001, 39.0973986580001, 39.0900920320001, 39.0990310770001)
lat <- c(46.7626812080001, 46.7577288260001, 46.7584910210001, 46.7626812080001)
xy <- wk::xy(lng, lat, crs = wk::wk_crs_longlat())
poly <- wk::wk_polygon(xy)
(sfc <- sf::st_as_sfc(poly))
#> Geometry set for 1 feature 
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 39.09009 ymin: 46.75773 xmax: 39.09903 ymax: 46.76268
#> Geodetic CRS:  WGS 84
#> POLYGON ((39.09903 46.76268, 39.0974 46.75773, ...
# Using oriented = TRUE to avoid reordering
(geog <- s2::as_s2_geography(sfc, oriented = TRUE))
#> <geodesic s2_geography[1] with CRS=OGC:CRS84>
#> [1] POLYGON ((39.0990311 46.7626812, 39.0973987 46.7577288, 39.090092 46.758491, 39.0990311 46.7626812))
(sfc2 <- sf::st_as_sfc(geog))
#> Geometry set for 1 feature 
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 39.09009 ymin: 46.75773 xmax: 39.09903 ymax: 46.76268
#> Geodetic CRS:  WGS 84
#> POLYGON ((39.09903 46.76268, 39.0974 46.75773, ...

waldo::compare(
  sfc |> sf::st_set_crs(NA) |> wk::wk_vertices() |> wk::as_wkt(),
  sfc2 |> sf::st_set_crs(NA) |> wk::wk_vertices() |> wk::as_wkt()
)
#> unclass(old) vs unclass(new)
#> - "POINT (39.0990310770001 46.7626812080001)"
#> + "POINT (39.0990310770001 46.76268120800011)"
#>   "POINT (39.0973986580001 46.7577288260001)"
#>   "POINT (39.0900920320001 46.7584910210001)"
#> - "POINT (39.0990310770001 46.7626812080001)"
#> + "POINT (39.0990310770001 46.76268120800011)"

Created on 2023-08-15 with reprex v2.0.2

paleolimbot avatar Aug 15 '23 13:08 paleolimbot

I wonder if the default snap precision (7 decimal places) is causing the move here:

Indeed, increasing the precision helps.

st_make_valid(st_as_sfc(vec), s2_options = s2::s2_options(snap = s2::s2_snap_precision(1e+09)))

makes the lines coincide (but the coordinates are slightly different).

BTW: Setting the precision too high (i.e. 1e+18) and then run st_make_valid() crashes RStudio.

kadyb avatar Aug 15 '23 18:08 kadyb