Attempt to fix invalid way geometries
Overview
geom.buffer(0) can often fix invalid geometries, e.g. self-crossing polygons. Applying it to invalid way geometries increases the number of usable way geometries produced from certain datasets.
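To make "self-crossing" concrete, here is a small sketch that is independent of the geometry library used in this PR. The function names are hypothetical, and it only detects two non-adjacent edges that properly cross; a real validity check covers more cases than this.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, q1, q2):
    """True if the two segments cross at a single point interior to both."""
    return (
        _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
        and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
    )


def ring_self_crosses(coords):
    """coords: (x, y) vertices of a ring, without repeating the first vertex at the end."""
    n = len(coords)
    edges = [(coords[i], coords[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edges that share a vertex touch by construction; skip them.
            if j == i + 1 or (i == 0 and j == n - 1):
                continue
            if segments_cross(*edges[i], *edges[j]):
                return True
    return False


bow_tie = [(0, 0), (2, 2), (2, 0), (0, 2)]  # first and third edges cross at (1, 1)
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(ring_self_crosses(bow_tie), ring_self_crosses(square))
```

A ring like the bow tie is the kind of input that a zero-width buffer can often turn into a valid polygon or multipolygon.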
Checklist
- [x] Add entry to CHANGELOG.md
This is a potentially costly operation that the user may not want. Should it be triggered by a flag (defaulting to false)? Alternatively, if invalid ways are being discarded (not sure if they are), is the right thing to do to leave them in the dataframe? Not all operations care about validity, and users can always process the geometries like this explicitly. We could provide a library method in OSM to do what you've proposed.
'Tis costly indeed, but it's only attempted on geometries that are found to be invalid.
Prior to this change, invalid way geometries produced null geometries (but retained the rest of the metadata). I think that's wrong either way.
My inclination is to make it optional but default to fixing geometries. It feels more useful to be handed a dataframe of geometries that are as valid as possible than to put the onus of fixing invalid ones on the user. I.e. the expectation should be that things are good to go.
There's no guarantee that way geometries are 1:1 with the OSM element that they're derived from (there might be cases where the invalid geometry is helpful, but I'm struggling to come up with one).
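Roughly what I have in mind, as a sketch rather than the actual change: the function name and the fix_invalid flag are hypothetical, and the geometry is assumed to expose is_valid, is_empty and buffer() (as Shapely geometries do). Returning None when the repair fails, and handing back the invalid geometry untouched when the flag is off, are both assumptions that are up for discussion.

```python
def repair_if_invalid(geom, fix_invalid=True):
    """Return geom unchanged when it is valid; otherwise optionally try a zero-width buffer.

    `geom` is any object with `is_valid`, `is_empty` and `buffer(distance)`.
    """
    if geom is None or geom.is_valid:
        return geom  # common case: the costly repair is never attempted
    if not fix_invalid:
        return geom  # caller opted out: leave the invalid geometry in place
    repaired = geom.buffer(0)
    if repaired.is_valid and not repaired.is_empty:
        return repaired
    return None  # assumption: fall back to a null geometry when the repair fails
```

The validity check runs first, so valid geometries never pay for the buffer, and anyone who wants the raw invalid geometries can pass fix_invalid=False.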
Fair enough.
Tangentially, today I've been encountering empty geometries which are causing any number of headaches. I can't give an example of one because I just want to finish this job and get on with my life, but they are out there. (Encountered these during processing of the most recent planet snapshot in s3://osm-pds/.)