gdal_polygonize may generate invalid polygons according to ST_IsValid()
Cf "POLYGON ((1 4,1 3,0 3,0 1,1 1,1 0,3 0,3 1,4 1,4 3,3 3,3 4,1 4),(1 3,3 3,3 1,1 1,1 3))" from https://github.com/OSGeo/gdal/pull/7344/commits/82dfa7ad434f7548362f7d3b8629ae05cdf64b6b. This may be specific to 8-connectedness. The issue is not specific to PR #7344
Actually this is likely related only to 8-connectedness, and it is somewhat by design. With 8-connectedness, pixels that touch only at a corner are merged into the same polygon, and in such cases a single Polygon cannot be valid according to ST_IsValid(). Either we should just document that, or possibly add an option to generate a valid MultiPolygon instead
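To illustrate why the WKT above fails validation, here is a minimal sketch in plain Python (no GDAL or GEOS involved). It only looks for vertices shared between the exterior ring and the hole; the helper names `parse_rings` and `shared_vertices` are made up for this example, and the check is a simplification, not a full OGC validity test (it would miss a ring touching the middle of a segment, for instance). A hole that touches the shell at more than one point cuts the polygon interior into disconnected pieces, which is what ST_IsValid() rejects.

```python
import re

# WKT quoted earlier in this issue (8-connected output of polygonize).
WKT = ("POLYGON ((1 4,1 3,0 3,0 1,1 1,1 0,3 0,3 1,4 1,4 3,3 3,3 4,1 4),"
       "(1 3,3 3,3 1,1 1,1 3))")


def parse_rings(wkt):
    """Return the rings of a simple 2D POLYGON WKT as lists of (x, y) tuples."""
    rings = []
    # Each innermost "(...)" group is one ring: exterior first, then holes.
    for ring_text in re.findall(r"\(([^()]+)\)", wkt):
        ring = [tuple(float(v) for v in pt.split()) for pt in ring_text.split(",")]
        rings.append(ring)
    return rings


def shared_vertices(shell, hole):
    """Vertices that appear in both rings (simplified touch detection)."""
    return sorted(set(shell) & set(hole))


shell, hole = parse_rings(WKT)
touching = shared_vertices(shell, hole)

print("hole touches shell at:", touching)
# More than one touch point means the interior is split into separate parts.
print("interior disconnected:", len(touching) > 1)
```

Here the hole shares all four of its corners with the exterior ring, so the remaining area consists of four rectangles that meet only at corner points. That shape can be represented as a valid MultiPolygon but not as a valid single Polygon.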
Maybe we should not output invalid polygons, or at least give users an easy option for making them valid. Possible solutions:
- Add an option --make-valid to the polygonize program https://gdal.org/en/stable/programs/gdal_raster_polygonize.html
- Document how to make the result valid with a pipeline (needs developing/testing so that the pipeline really works)
gdal raster pipeline --progress ! read input.tif ! polygonize -c ! make-valid ! write 8conn_valid.json
gdal raster polygonize --of GeoJSON -c -i P3412A.tif -o 8conn.json --progress
0...10...20...30...40...50...60...70...80...90...100 - done in 00:01:05.
ST_IsValid finds 27390 invalid features in the output file. The errors can be corrected with "make-valid":
gdal vector make-valid --of GeoJSON -i 8conn.json -o 8conn-valid.json --progress
0...10...20...30...40...50...60...70...80...90...100 - done in 00:01:36.
For some reason this pipeline command starts but then just burns CPU with the progress stuck at 0:
gdal pipeline --progress ! read P3412A.tif ! polygonize -c ! write 8conn.json
EDIT: It is slow because the pipeline writes a temporary GeoPackage. Perhaps polygons are inserted one-by-one each as a new transaction?
CC @dbaston