terra
terra copied to clipboard
Calculating buffer crashes R session in edge cases on Windows only
I have encountered some issues with certain polygons when calculating unions or buffers. This is hard to reproduce and I have not identified the underlying pattern in the problem objects. One such real-life example is attached. Running the code below reliably crashes the R session for me under Windows, but not in MacOS or Linux.
I would love to know what is going on to avoid these catastrophic uncaught errors that kill any session they occur in.
crds = as.matrix(read.csv("ProblemPolygon.csv"), row.names = 1)
poly = terra::vect(crds[ , -1], type = "polygons", crs = "local")
terra::buffer(poly, 0.05) # Aborts R session in Windows 10 (terra v. 1.7.55)
Could this be related to https://github.com/rspatial/terra/issues/1331 ?
BTW: Shouldn't crds
be crds[, 2:3]
to omit column with ID? Nevertheless, the crash still occurs on Windows.
BTW: Shouldn't
crds
becrds[, 2:3]
to omit column with ID? Nevertheless, the crash still occurs on Windows.
Yep - sorry. I had 'row.names = 1' in my test version and managed to copy/paste the wrong text :D The OP has been edited.
This works for a similar case
library(terra)
pnt <- vect(cbind(0,0), crs="local")
pol <- buffer(pnt, 0.12, quadsegs=90)
z <- terra::buffer(pol, 0.05)
And seems to be related to precision or distance between nodes. Your data
library(terra)
crds <- as.matrix(read.csv("https://github.com/rspatial/terra/files/13552045/ProblemPolygon.csv"))
poly <- terra::vect(crds[,2:3], type = "polygons", crs = "local")
Works with sf
s <- sf::st_as_sf(poly)
a <- sf::st_buffer(s, 0.05, nQuadSegs=10)
Whereas with terra you need to use a high number of quadsegs
b <- terra::buffer(poly, 0.05, quadsegs=50)
## crash:
#terra::buffer(poly, 0.05, quadsegs=10)
I do not understand the difference in behavior.
Interesting. I had tried to identify possible problem intervals (with undefined tangents perhaps - I don't know the details of the downstream C code) but I cannot yet see any qualitative difference between node constellations that crash and those that are OK. If there were a way to predict the crash, then I could at least alter problem cases...
Also, all examples work fine on MacOS and Linux.
This also works for me on windows with GEOS 3.9.3 and on Ubuntu with 3.10.2. It fails on windows with
gdal(lib="geos")
[1] "3.11.2"
So I am thinking that it is related to the version of GEOS, not to the OS; but I am not sure. Perhaps @dbaston has an idea about what can be going on?
How does terra pick up the GEOS version? It clearly does not work through the R geos package as geos::geos_version()
gives different version info so must be a parallel installation.
How would one test out different GEOS versions? (I am not a Windows user so am having no success building from source on this platform).
I have GEOS 3.11.0 on MacOS that works fine.
To get the version of GEOS that is used by terra:
terra::gdal(lib="geos")
#[1] "3.11.2"
On windows, I use whichever version comes with RTools. Not trivial to change that, I think. One could use an older version of R
I tried rolling back to Rtools 4.2, but terra was still reporting the same GEOS version (3.11.2; using the method you mentioned). My understanding was that Rtools42 should provide GEOS 3.9.3. There is also a newer GEOS version (3.12.1) available, which I downloded and built, but I am obviously not linking that to be visible to terra correctly. I have to admit that my knowledge of Windows dev (=0) and the intricacies of the terra/GEOS ecosystem (<<1) mean I cannot really contribute much more to the process. Unfortunately though, this is a critical issue that currently makes my dependent package unuseable for many Windows users.
Am I doing something incorrect in reading these inputs with terra? I am seeing 361 single-point polygons:
> geom(poly, wkt=TRUE)
[1] "POLYGON ((-0.118 -0.828))" "POLYGON ((-0.118 -0.827))" "POLYGON ((-0.118 -0.8259))"
@dbaston thank you for having a look. The first column of crds
should be ignored or set to a constant to avoid that it serves as a geometry ID. For example
crds <- as.matrix(read.csv("https://github.com/rspatial/terra/files/13552045/ProblemPolygon.csv"))
poly <- terra::vect(crds[,2:3], type = "polygons", crs = "local")
I have added a R and C++ terra:::.buffer2
method that is the same as buffer
but with no options or other fluff. It still crashes. It may help in debugging this. Perhaps the biggest mystery is why this works in sf. The creation of GEOS geoms is different there, but that does not create issues with other methods. sf allows setting precision, but in this case the problem seems to be that there is insufficient precision, not too much; and either way the default should be the same.
Also of note is that reducing the number of nodes in the offending polygon fixes the problem:
library(terra)
f <- "https://github.com/rspatial/terra/files/13552045/ProblemPolygon.csv"
crds1 <- as.matrix(read.csv(f))[,2:3]
crds2 <- crds1[seq(1, nrow(crds1), 2), ]
pol1 <- terra::vect(crds1, type = "polygons", crs = "local")
pol2 <- terra::vect(crds2, type = "polygons", crs = "local")
buffer(pol2, 0.05, quadsegs=10)
# class : SpatVector
# geometry : polygons
# dimensions : 1, 0 (geometries, attributes)
# extent : -0.288, -0.068, -0.938, -0.718 (xmin, xmax, ymin, ymax)
# coord. ref. : Cartesian (Meter)
buffer(pol2, 0.05, quadsegs=1) |> crds() |> nrow()
#[1] 324
#crash
#buffer(pol1, 0.05, quadsegs=10)
Can't reproduce on Ubuntu with 3.11.2, nothing shows up in valgrind. I can check on Windows at some point but don't have access to a machine at the moment.
Adding a unit test with this geometry to the GEOS test suite in 3.11.2 doesn't appear to cause a problem on the Windows toolchains used there: https://github.com/dbaston/libgeos/actions/runs/7159847765
The below suggests that the binary representation, as expected, is the same in "sf" and "terra"
hexsf <- sf::st_as_sf(pol1) |> sf::st_geometry() |> sf::st_as_binary() |> sf::rawToHex()
hexterra <- geom(pol1, hex=TRUE)
toupper(hexsf) == hexterra
Just FYI, I am planning to push a workaround using the increased 'quadsegs' value (setting this to 90 prevents crashes on at least all of my tests so far). Unclear though whether geometries could still turn up for which this is not sufficient - but at least I can get my package (https://cran.r-project.org/package=Rtrack) back running for Windows users. I'll keep following this issue and build in a more permanent solution once its clear what's going on.
I have added exactly the same C++ GEOS buffer function to my fork of "sf" and to the test branch of "terra". Here is the code in terra and here in sf.
I can now do
library(terra)
f <- "https://github.com/rspatial/terra/files/13552045/ProblemPolygon.csv"
crds1 <- as.matrix(read.csv(f))[,2:3]
crds2 <- crds1[seq(1, nrow(crds1), 2), ]
pol1 <- terra::vect(crds1, type = "polygons", crs = "local")
pol2 <- terra::vect(crds2, type = "polygons", crs = "local")
wkt2 <- terra::geom(pol2, wkt=TRUE)
a = sf:::.sfwktbuffer(wkt2, 0.05)
#geos cpp sf
b = terra:::.terrawktbuffer(wkt2, 0.05)
#geos cpp terra
wkt1 <- terra::geom(pol1, wkt=TRUE)
a = sf:::.sfwktbuffer(wkt1, 0.05)
#geos cpp sf
## crashes
## b = terra:::.terrawktbuffer(wkt1, 0.05)
So exactly the same code works with sf, but not with terra. On Windows with R 4.3.2, Rtools 4.3 and GEOS 3.11.2 in both cases.
Does GEOS have some global environment variable that I do not set, or do not set correctly? Other ideas??
As a temporary workaround, projecting the vector to a different CRS (I used Mollweide), buffering, then back-projecting can work.