mosaic
Function st_isvalid does not produce consistent results.
Describe the bug When using the st_isvalid function to validate polygon geometries, the results changed based on the order of the fields used in the select statement.
To Reproduce Sample code:

    df = spark.sql("""
        select field A, field B, polygon_wkt
        from db1.polygon_tbl
    """).filter(mos.st_isvalid("polygon_wkt") == False).alias('df')
    display(df)

The above code returned no invalid polygons. When fields A and B were swapped in the select statement, two polygons were flagged as invalid.
Expected behavior I would expect the results to be the same regardless of the order of field A and B.
Additional context We believe the two 'invalid' polygons were fixed as we ran further code tests and they behaved normally.
Hi @KafesCNH, when you run display(df), Spark selects 1000 rows to display. If you run it again with a different select statement, the 1000 selected rows might be different from the previous ones.
Could you please run a df.count() instead of display(df) to check if the issue was in the display?
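For reference, here is a rough sketch of how that check could look. It assumes `spark` and `mos` are already available in the notebook, and the column names `field_a` and `field_b` are hypothetical stand-ins for the real fields A and B. The helper runs the same filter for both column orders and returns the invalid-row count for each, so the comparison does not depend on what display(df) happens to show.

```python
def count_invalid_by_column_order(
    spark,
    mos,
    table="db1.polygon_tbl",
    orderings=(("field_a", "field_b"), ("field_b", "field_a")),
):
    """Return {column ordering: number of rows where st_isvalid is False}.

    `field_a` / `field_b` are placeholder column names (assumption);
    replace them with the real fields from the table.
    """
    counts = {}
    for cols in orderings:
        # Build the select list in the requested order, geometry column last.
        select_list = ", ".join(tuple(cols) + ("polygon_wkt",))
        df = spark.sql(f"select {select_list} from {table}")
        # Same filter as in the report: keep only rows flagged invalid.
        invalid = df.filter(mos.st_isvalid("polygon_wkt") == False)  # noqa: E712
        # count() evaluates the full result instead of a displayed subset.
        counts[tuple(cols)] = invalid.count()
    return counts


# Usage in a notebook where `spark` and `mos` are already set up:
# print(count_invalid_by_column_order(spark, mos))
```

If the two counts match, the difference was most likely in what display(df) showed; if they differ, that would point to a real inconsistency worth digging into.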
@KafesCNH any chance you can help to re-create the issue? (unless you have it already fixed)