
`geos` future

Open atsyplenkov opened this issue 3 months ago • 2 comments

What are your thoughts on the future of the package?

I really like it and use it almost daily in my research. However, I find it unfair that the package remains rather unpopular. Perhaps that is because of the lack of documentation, as you previously mentioned in #60. TBH, if it were not for your blog posts on fish&whistle, I doubt I would have understood how to work with the package. That being said, I am happy to help with documenting functions and creating some examples. An FAQ might also be helpful to cover basic things like coordinate transformation (e.g., #81).

Another thought I had is that some basic geospatial operations are not obvious to do with geos (like #98). They are still possible and run very fast, but they require some extra work. For example, removing holes in polygons:

library(terra)
library(geos)
library(bench)

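# Remove holes from a polygon. With min_area = NULL every hole is filled;
# otherwise only holes with area > min_area are kept.
geos_remove_holes <-
  function(geom, min_area = NULL) {
    if (is.null(min_area)) {
      return(get_parent(geom))
    }

    # Ring 1 is the exterior shell; the remaining rings are holes.
    nring <- geos::geos_num_rings(geom)
    ring_areas <- vapply(
      seq_len(nring),
      \(j) get_ring_area(geom, j),
      FUN.VALUE = numeric(1)
    )
    # Indices of rings above the threshold. Dropping the first index below
    # assumes the shell itself is larger than min_area.
    areas_bool <- which(ring_areas > min_area)
    parent <- get_parent(geom)
    # Turn the holes to keep into polygons and cut them out of the shell.
    child <- geos::geos_ring_n(geom, areas_bool[-1]) |>
      geos::geos_make_collection() |>
      geos::geos_polygonize()

    geos::geos_difference(parent, child)
  }

# Exterior ring rebuilt as a polygon without holes.
get_parent <-
  function(geom, ...) {
    geos::geos_ring_n(geom, 1) |>
      geos::geos_polygonize() |>
      geos::geos_unnest(...)
  }

# Area enclosed by the i-th ring of geom.
get_ring_area <-
  function(geom, i) {
    geom |>
      geos::geos_ring_n(i) |>
      geos::geos_polygonize() |>
      geos::geos_area()
  }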

wk_string <-
  paste(
    "POLYGON (",
    "(2000000 6000000, 2001000 6000000, 2001000 6001000, 2000000 6001000, 2000000 6000000),",
    "(2000200 6000200, 2000260 6000200, 2000260 6000260, 2000200 6000260, 2000200 6000200),",
    "(2000300 6000300, 2000310 6000300, 2000310 6000310, 2000300 6000310, 2000300 6000300)",
    ")"
  )
poly <- terra::vect(wk_string, crs = "epsg:2193")
poly_geos <- geos::as_geos_geometry(poly)

bench::mark(
  terra = terra::fillHoles(poly),
  geos = geos_remove_holes(poly_geos),
  check = FALSE,
  iterations = 100L
)
#> # A tibble: 2 × 13
#>   expression      min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
#>   <bch:expr> <bch:tm> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
#> 1 terra       293.5µs  331µs     2502.     1.6KB        0   100     0       40ms
#> 2 geos         92.5µs  100µs     9745.        0B        0   100     0     10.3ms


par(mfrow = c(1, 3))
plot(poly_geos, col = "grey", lwd = 2)
plot(geos_remove_holes(poly_geos), col = "grey", lwd = 2)
plot(geos_remove_holes(poly_geos, min_area = 100), col = "grey", lwd = 2)

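[Image: three-panel plot of poly_geos, geos_remove_holes(poly_geos), and geos_remove_holes(poly_geos, min_area = 100)]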
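To make the `min_area` behaviour in the third call easier to follow, here is a minimal sketch (not part of the original snippet) that lists the area enclosed by each ring of the test polygon, using the same geos calls as `get_ring_area()` above. The names `wkt`, `n_rings`, `ring_area` and `kept_holes` are illustrative only, and the sketch assumes, as the snippet above does, that ring 1 is the exterior shell.

```r
library(geos)

# Same test polygon as in the example above: a 1000 x 1000 shell
# with a 60 x 60 hole and a 10 x 10 hole (coordinates in metres).
wkt <- paste(
  "POLYGON (",
  "(2000000 6000000, 2001000 6000000, 2001000 6001000, 2000000 6001000, 2000000 6000000),",
  "(2000200 6000200, 2000260 6000200, 2000260 6000260, 2000200 6000260, 2000200 6000200),",
  "(2000300 6000300, 2000310 6000300, 2000310 6000310, 2000300 6000310, 2000300 6000300)",
  ")"
)
poly <- as_geos_geometry(wkt)

# Shell plus holes; ring 1 is assumed to be the shell.
n_rings <- geos_num_rings(poly)

# Area enclosed by each ring, computed by polygonizing the ring on its own.
# From the coordinates this should be 1e6, 3600 and 100.
ring_area <- vapply(
  seq_len(n_rings),
  function(i) geos_area(geos_polygonize(geos_ring_n(poly, i))),
  FUN.VALUE = numeric(1)
)

# Holes that survive a threshold of 100: the comparison is strict,
# so a hole of exactly 100 square metres is not kept.
min_area <- 100
kept_holes <- which(ring_area[-1] > min_area)
```

Because the comparison is `ring_areas > min_area`, calling `geos_remove_holes(poly_geos, min_area = 100)` should fill the 10 x 10 m hole and keep only the 60 x 60 m one.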

Over the years of using geos, I have collected several such snippets. At first, I was thinking about creating a separate package, but now I think they might be better suited here. What do you think? Should such functions be added?

atsyplenkov avatar Sep 19 '25 02:09 atsyplenkov

Very cool! Those functions definitely deserve to be in an R package and made available to users!

From this package's perspective, staying close to the GEOS API is probably best. I haven't done a great job keeping up with the GEOS API changes as it is, and mirroring GEOS is a nicely scoped problem that doesn't require a lot of API decisions (just do what GEOS did!).

A good reason to put something here rather than in another package would be if some combination of C function calls needed to happen in a loop (although even then, the libgeos/geos design would let you write those safely in another package too).

That said, you just put together an excellent PR and this package could use some love. If adding those functions is something that gets you involved, there's really no harm in adding them.

paleolimbot avatar Sep 19 '25 14:09 paleolimbot

Thank you for your reply. The roadmap is clear.

I am happy to contribute to the package as much as I can. My thinking is that it would be better to start by improving the documentation and examples. Then, we can probably proceed with updating GEOS to 3.14. According to their news, there were several new useful functions (for example, clustering!!!) for which it would be very nice to have bindings. Unfortunately, I am not very skilled in C++, so there will be a learning curve.

If you are happy with this plan, please expect PRs from me in the coming weeks and months.

atsyplenkov avatar Sep 20 '25 04:09 atsyplenkov