
Section on cleaning geometries in the geometry chapter

Open Robinlovelace opened this issue 2 years ago • 11 comments

I believe this is currently the only mention of cleaning geometries in the book:

https://github.com/Robinlovelace/geocompr/blob/3579906af69949dbe47fec783b62ad530018ce14/10-gis.Rmd#L255-L263

As @defuneste has flagged this, and anecdotal evidence suggests it's a common issue, I suggest it will be a useful section. Thoughts on best tools for the job? Options include (we can check off which to test/mention):

  • [ ] st_make_valid()
  • [ ] geos_make_valid(), how does this differ from the {sf} version?
  • [ ] pprepr package: https://gitlab.com/dickoa/pprepr and https://twitter.com/dickoah/status/1262765292825927682
  • [ ] Tools in {spatstat.geom}, I defer to your knowledge on this @defuneste
  • [ ] Any others? I thought rmapshaper might provide something, but it seems not to

I'm interested in which of these work: people look to the book for recommendations, so anything we cover should be tested and known to work! I've had so-so experience with st_make_valid(), but it's in {sf} so it should be covered first, then the {spatstat} tools as they are well maintained. pprepr is not on CRAN and seems to be unmaintained.
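A minimal sketch of the check-then-repair workflow that such a section could start from, assuming {sf} is installed. The "bowtie" polygon is a made-up illustration, not a geometry from the book:

```r
library(sf)

# Illustrative "bowtie" ring: the edges (0,0)-(1,1) and (1,0)-(0,1)
# cross at (0.5, 0.5), so the polygon self-intersects
bowtie = st_sfc(st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0)))))

st_is_valid(bowtie)                 # FALSE
st_is_valid(bowtie, reason = TRUE)  # describes why the geometry is invalid

# Repair, then check again and inspect what kind of geometry came back
repaired = st_make_valid(bowtie)
st_is_valid(repaired)
st_geometry_type(repaired)
```

Checking the geometry type after the repair matters because a repaired POLYGON may come back as a different type, which can surprise downstream code.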

Robinlovelace avatar Jun 23 '22 21:06 Robinlovelace

I am just starting to use some tools in {spatstat}, so I am slowly reading the docs, book and code. Apparently {polyclip} (https://github.com/baddstats/polyclip) is used to do some "cleaning". It is used here (https://github.com/spatstat/spatstat.geom/blob/d90441de5ce18aeab1767d11d4da3e3914e49bc7/R/window.R#L230-L240).

This is in the owin class and it is probably used to avoid self-intersecting polygons.

I will have to test it a bit to get a better understanding ...

defuneste avatar Jun 23 '22 22:06 defuneste

I have adapted this web page, which shows a bunch of topological errors: http://s3.cleverelephant.ca/invalid.html (it is from @pramsey; related blog post: https://www.crunchydata.com/blog/waiting-for-postgis-3.2-st_makevalid).

The script is here: https://github.com/defuneste/utile_comme_du_pq/blob/master/erreur_topo.R. It has a lot of dead code and should be cleaned up a bit soon. I could not understand/reproduce all the errors, but I think it is a very nice setup for testing algorithms that "clean geometries". On the negative side, it only includes one or two geometries per error.

Stuff that can be improved (for later):

  • Try to organize errors into categories, i.e. polygon, polygon + hole(s), multipolygon
  • Display vertices

defuneste avatar Aug 01 '22 18:08 defuneste

The Twitter post helped!

  • Ty Frazier (@syntheticpops) also mentioned terra::makeValid() and a terminal approach with ogr2ogr -skipfailures x.shp y.shp
  • @mdsumner mentioned sfdct::ct_triangulate() followed by group_by; this tweet is also very helpful for starting to understand the various approaches to this problem
  • Etienne Racine (@tiennebr) also brought up the classic buffer at 0 m, which we should add to the list
  • New Geographer mentioned v.clean in GRASS, which we already have in chapter 10

My shiny app is starting to look not too bad. I will add more options and see how I can host it somewhere so it is accessible to others.

edit: few typos

defuneste avatar Oct 11 '22 13:10 defuneste

This is awesome @defuneste, keep the ideas coming. Hope to implement some of them in time for the 2nd edition!

Robinlovelace avatar Oct 11 '22 13:10 Robinlovelace

I have tested {prepr} (with one p I think!) and {polyclip} in the small shiny app here (https://github.com/defuneste/utile_comme_du_pq/tree/master/topo_errors). We get very different results depending on the error and on the algorithm/implementation. Even if it is not perfect (we could expose some function arguments in the shiny app), I will try to figure out a way to publish it later. Hosting it myself will probably take me too much time, but in the meantime I can use the free shiny hosting. What do you think?

How deep do you want to go in geocompr?

I think the minimum should include the two functions from {terra} and {sf} and the classic "hack" of st_buffer(x, 0). Polyclip is probably the least interesting, even if it is quite intuitive to understand how it works.

I will need to read the paper on "constrained triangulation" to understand {prepr}, but the results look good.

My next step should be to read a bit more on how terra::makeValid() and sf::st_make_valid() work.

defuneste avatar Oct 12 '22 16:10 defuneste

Look forward to giving this a spin, over the weekend maybe :pray:

Robinlovelace avatar Oct 12 '22 20:10 Robinlovelace

Well I am hosting it! : https://www.branchtwigleaf.com/shinyapps/make-valid-geom/

If it is useful I would happily move it to some geocompsomething, because I think the value is mostly pedagogical.

What I have learned from it:

  • I was surprised at the diversity of results in some cases

  • even though {terra} and {sf} both use GEOS, they sometimes give slightly different results; my guess is that this comes from different implementation choices. I have no idea which is correct (if either is), but we, the geo communities, should find a way of explaining it.

  • st_buffer(geom, 0) is great but sometimes produces weird results with multipolygons or polygons with holes

  • polyclip should probably not be used outside of cases where you need a window (it has a similar kind of problem to the buffer, because polyclip is a clipping tool that clips against a bigger polygon). The trouble could also come from my implementation, as it involves a lot of format conversions (sf -> polyclip -> sf)

  • st_repair: not much to say, it seems good. It is a bit heavy on the dependencies side, so not for a basic user. I have still not read the paper
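A rough sketch of the kind of side-by-side comparison the app makes, assuming {sf} and {terra} are installed. The bowtie polygon is an illustrative stand-in, not one of the app's test geometries, and the outputs are deliberately not stated because they depend on the GEOS version in use:

```r
library(sf)

# Illustrative self-intersecting "bowtie" polygon
bowtie = st_sfc(st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0)))))

# Three candidate repairs of the same geometry
fix_sf = st_make_valid(bowtie)                                # {sf}
fix_buffer = st_buffer(bowtie, 0)                             # zero-width buffer "hack"
fix_terra = terra::makeValid(terra::vect(st_as_sf(bowtie)))   # {terra}

# Validity alone does not tell us whether the shape was preserved ...
st_is_valid(fix_sf)
st_is_valid(fix_buffer)
terra::is.valid(fix_terra)

# ... so also compare the areas (and ideally plot each result)
st_area(fix_sf)
st_area(fix_buffer)
st_area(st_as_sf(fix_terra))
```

Comparing areas is only a crude check: two repairs can have the same area and still differ in shape, which is why plotting the results remains useful.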

Edit: updating the link!

defuneste avatar Oct 19 '22 20:10 defuneste

The GRASS documentation for v.clean is great and I should think of a way to add it: https://grass.osgeo.org/grass82/manuals/v.clean.html

defuneste avatar Oct 19 '22 21:10 defuneste

Probably you want the "structure" option for the make valid parameters. That should give a result that is "much like buffer(0)" without the failure modes.
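A hedged sketch of how the two make-valid methods might be compared from R. It assumes a recent {sf} whose st_make_valid() accepts the geos_method argument, built against GEOS 3.10 or later; on older setups the argument may be unavailable or have no effect. The test polygon is illustrative only:

```r
library(sf)

# Illustrative invalid polygon: the "hole" ring (1 to 3) sticks out of the shell (0 to 2)
shell = rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))
hole = rbind(c(1, 1), c(1, 3), c(3, 3), c(3, 1), c(1, 1))
bad = st_sfc(st_polygon(list(shell, hole)))
st_is_valid(bad, reason = TRUE)

# Same input, two repair strategies
by_linework = st_make_valid(bad, geos_method = "valid_linework")
by_structure = st_make_valid(bad, geos_method = "valid_structure")

# Both results should be valid; whether the areas agree shows if the methods agree
st_is_valid(by_linework)
st_is_valid(by_structure)
st_area(by_linework)
st_area(by_structure)
```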

pramsey avatar Oct 19 '22 21:10 pramsey

Hi @Robinlovelace, do we have a deadline on this?

I will probably need some time to understand GRASS a bit better before adding it. It could be mentioned in chapter two (explaining the concept of validity, maybe in the same place as inner rings / holes?) or later in chapter 5, but I do not see where.

The link from @pramsey was a great help in understanding the GEOS level (I will still have to try some cases and "draw" them). We still need to work out how {terra}/{sf} use it. It is hard because not everyone will be on GEOS 3.10.

defuneste avatar Oct 27 '22 18:10 defuneste

Hey @defuneste, no hard deadline but sooner would be better.

Robinlovelace avatar Oct 27 '22 20:10 Robinlovelace