RoadDetections

6,416,947 invalid linestrings in _USA.tsv

Open jmealo opened this issue 1 year ago • 1 comment

Hello,

The _USA.tsv file contains a large number of invalid linestrings: they either contain only a single point or are self-referencing.

Only about 88% of the rows in the US file are valid linestrings:

  • Total rows in _USA.tsv: 54,484,737
  • Valid geometries: 48,067,790
  • Invalid geometries (or points): 6,416,947 (according to @turf/boolean-valid)
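
For anyone who wants to reproduce a rough version of this tally without turf, here is a minimal Python sketch. It is not what I ran and it is not equivalent to @turf/boolean-valid: it only flags the degenerate cases (fewer than two positions, or every position being the same point, which is one plausible reading of "self-referencing"). It assumes the last tab-separated column of each row holds a GeoJSON Feature or bare geometry; the function names are made up for illustration.

```python
import json
from collections import Counter


def classify_linestring(coords):
    """Classify a list of [x, y] positions (hypothetical helper).

    Only catches degenerate linestrings; it does not check
    anything else that a full validity test would.
    """
    if len(coords) < 2:
        return "too_few_points"
    distinct = {tuple(p[:2]) for p in coords}
    if len(distinct) < 2:
        return "repeated_point"
    return "valid"


def tally_rows(lines):
    """Count classifications over an iterable of TSV rows.

    Assumption: the last tab-separated field is GeoJSON text.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        obj = json.loads(line.split("\t")[-1])
        # Accept either a Feature wrapper or a bare geometry object.
        geom = obj.get("geometry") if obj.get("type") == "Feature" else obj
        if not geom or geom.get("type") != "LineString":
            counts["not_linestring"] += 1
            continue
        counts[classify_linestring(geom.get("coordinates", []))] += 1
    return counts
```

Passing an open file handle (for example `tally_rows(open("_USA.tsv", encoding="utf-8"))`) streams the rows one at a time, so the whole file never has to fit in memory. The share of valid rows is then `counts["valid"] / sum(counts.values())`; with the figures above, 48,067,790 / 54,484,737 is roughly 88.2%.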

On top of that, PostGIS's ST_IsValid returns false for an additional 211,991 rows.

Additionally, providing the file with one WKT geometry per line would make it easier for many folks to load into their database, and it would be much smaller on disk.
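
To illustrate what a WKT-per-line file could look like, here is a small sketch that turns GeoJSON LineString coordinates into a WKT string. As above, it assumes the geometry is stored as GeoJSON, and `linestring_to_wkt` is a hypothetical helper, not part of any existing tool. Coordinates are written with Python's default number formatting, so very small values could come out in exponent notation.

```python
def linestring_to_wkt(coords):
    """Format a list of [x, y] positions as a WKT LINESTRING.

    Hypothetical helper; only the first two ordinates of each
    position are used, so any Z value is dropped.
    """
    if not coords:
        return "LINESTRING EMPTY"
    pairs = ", ".join(f"{p[0]} {p[1]}" for p in coords)
    return f"LINESTRING ({pairs})"
```

A row whose coordinates are `[[-122.5, 47.25], [-122.0, 47.5]]` would become `LINESTRING (-122.5 47.25, -122.0 47.5)`, which most spatial databases can ingest directly. Note that this conversion does not repair anything: a single-point input still produces an invalid linestring.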

Thanks, Jeff

jmealo · May 14 '24 13:05

This is a duplicate of #11

jmealo · May 14 '24 13:05