sf icon indicating copy to clipboard operation
sf copied to clipboard

sf::st_lenght() seems to give incorrect results when using S2 engine

Open jlacko opened this issue 1 year ago • 9 comments

The sf::st_length() seems to give incorrect result when used to calculate length of shared border between two countries.

This issue flows from an earlier SO question about calculating length of the border between Germany and Denmark.

The result obtained by calculating length of intersection of two polygons in WGS84 is off significantly (by a factor of 2). Reprojecting the object to ETRS89 gives a more reasonable length, as does running the calculation in WGS84 but with S2 turned off via sf::sf_use_s2(FALSE).

For a reproducible example consider this piece of code, noting that the "reference" length of the border is 68 kilometers and that doing the calculation on a lower resolution object is expected to give a somewhat lower figure.

library(sf)
library(dplyr)

border_length <- function(shape) {
  
  sf::st_intersection(shape[1], shape[2], model = "closed") %>%
    st_length()
    
}

gisco_wgs <- giscoR::gisco_get_countries(resolution = "20",
                                epsg = 4326,
                                country = c("DE", "DK")) %>% 
  st_geometry() # to avoid warnings about spatially constant variables

gisco_3035 <- st_transform(gisco_wgs, 3035)

border_length(gisco_wgs)
# 105547.6 [m] - this is too much!

border_length(gisco_3035)
# 52953.44 [m] - this feels about right, given the resolution...

sf_use_s2(FALSE)

border_length(gisco_wgs)
# although coordinates are longitude/latitude, st_intersection assumes that they are planar
# 52940.58 [m] - this again feels right, but note the difference compared to results with s2 turned on!

jlacko avatar Sep 12 '22 10:09 jlacko

@paleolimbot could you take a look at this? Is something maybe recycled once too often?

edzer avatar Sep 12 '22 11:09 edzer

It looks like it's a problem with the edge direction...s2 apparently cares that the edges defined by the borders between denmark and germany are in opposite directions. I don't remember if/how we can pass the edge_direction into the S2 intersection op from sf, but if we do it straight in S2 it looks like we get a better answer:

library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE

gisco_wgs <- giscoR::gisco_get_countries(resolution = "20",
                                         epsg = 4326,
                                         country = c("DE", "DK")) %>% 
  st_geometry() # to avoid warnings about spatially constant variables

gisco_3035 <- st_transform(gisco_wgs, 3035)

gisco_wgs_feat <- s2::s2_intersection(
  gisco_wgs[1], 
  gisco_wgs[2],
  s2::s2_options(
    model = "closed",
    edge_type = "undirected"
  )
)

s2::s2_length(gisco_wgs_feat)
#> [1] 52773.79

gisco_3035_feat <- geos::geos_intersection(gisco_3035[1], gisco_3035[2])
geos::geos_length(gisco_3035_feat)
#> [1] 52953.44

Created on 2022-09-12 by the reprex package (v2.0.1)

paleolimbot avatar Sep 12 '22 12:09 paleolimbot

Thanks! Should we make edge_type = "undirected" the default, for all transformers?

New & old values:

library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE
library(dplyr)

# Attaching package: ‘dplyr’

# The following objects are masked from ‘package:stats’:

#     filter, lag

# The following objects are masked from ‘package:base’:

#     intersect, setdiff, setequal, union


border_length <- function(shape) {
  
  sf::st_intersection(shape[1], shape[2], model = "closed", edge_type = "undirected") %>%
    st_length()
    
}

gisco_wgs <- giscoR::gisco_get_countries(resolution = "20",
                                epsg = 4326,
                                country = c("DE", "DK")) %>% 
  st_geometry() # to avoid warnings about spatially constant variables

gisco_3035 <- st_transform(gisco_wgs, 3035)

border_length(gisco_wgs)
# 52773.79 [m]
# 105547.6 [m] - this is too much!

border_length(gisco_3035)
# 52953.44 [m]
# 52953.44 [m] - this feels about right, given the resolution...

sf_use_s2(FALSE)
# Spherical geometry (s2) switched off

border_length(gisco_wgs)
# although coordinates are longitude/latitude, st_intersection assumes that they are planar
# 52940.58 [m]
# although coordinates are longitude/latitude, st_intersection assumes that they are planar
# 52940.58 [m] - this again feels right, but note the difference compared to results with s2 turned on!

Should we make edge_type = "undirected" the default, for all transformers?

edzer avatar Sep 12 '22 12:09 edzer

I see that I have made in principle the same reply a few seconds late... I would be happy to propose a pull request with edge_type = "undirected" for st_intersection, with a little elaboration on how it can be overridden in docs, if agreeable to the maintainers.

Thanks! Should we make edge_type = "undirected" the default, for all transformers?

New & old values:

library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.0; sf_use_s2() is TRUE
library(dplyr)

# Attaching package: ‘dplyr’

# The following objects are masked from ‘package:stats’:

#     filter, lag

# The following objects are masked from ‘package:base’:

#     intersect, setdiff, setequal, union


border_length <- function(shape) {
  
  sf::st_intersection(shape[1], shape[2], model = "closed", edge_type = "undirected") %>%
    st_length()
    
}

gisco_wgs <- giscoR::gisco_get_countries(resolution = "20",
                                epsg = 4326,
                                country = c("DE", "DK")) %>% 
  st_geometry() # to avoid warnings about spatially constant variables

gisco_3035 <- st_transform(gisco_wgs, 3035)

border_length(gisco_wgs)
# 52773.79 [m]
# 105547.6 [m] - this is too much!

border_length(gisco_3035)
# 52953.44 [m]
# 52953.44 [m] - this feels about right, given the resolution...

sf_use_s2(FALSE)
# Spherical geometry (s2) switched off

border_length(gisco_wgs)
# although coordinates are longitude/latitude, st_intersection assumes that they are planar
# 52940.58 [m]
# although coordinates are longitude/latitude, st_intersection assumes that they are planar
# 52940.58 [m] - this again feels right, but note the difference compared to results with s2 turned on!

Should we make edge_type = "undirected" the default, for all transformers?

jlacko avatar Sep 12 '22 12:09 jlacko

I think the only thing you have to be careful of is that polygon--polygon intersections still work the way you think they're going to. (I think that in practice that will only matter for polygons bigger than half the earth).

paleolimbot avatar Sep 12 '22 13:09 paleolimbot