
Function st_isvalid does not produce consistent results.

Open KafesCNH opened this issue 2 years ago • 2 comments

Describe the bug When using the st_isvalid function to validate polygon geometries, the results change depending on the order of the fields in the select statement.

To Reproduce Sample Code:

    df = spark.sql("""
        select field A, field B, polygon_wkt
        from db1.polygon_tbl
    """).filter(mos.st_isvalid("polygon_wkt") == False).alias('df')
    display(df)

The code above returned no invalid polygons. When fields A and B were swapped in the select statement, two polygons were reported as invalid.

Expected behavior I would expect the results to be the same regardless of the order of field A and B.

Additional context We believe the two 'invalid' polygons were fixed as we ran further code tests and they behaved normally.

KafesCNH avatar Dec 16 '22 14:12 KafesCNH

Hi @KafesCNH, when you run display(df), Spark will select 1000 rows to be displayed. If you run this again with a different select statement, the 1000 selected rows might be different from the previous ones.

Could you please run a df.count() instead of display(df) to check if the issue was in the display?
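
To show why a count is a more reliable check than a preview, here is a toy sketch in plain Python (not Spark or Mosaic). The `preview` and `count_invalid` helpers, the row layout, and the limit of 5 are all made up for illustration; `preview` only stands in for a display that shows the first N rows, and the rows are previewed unfiltered to keep the example small.

```python
# Toy model: a truncated preview depends on row order, a full count does not.
# All names here are hypothetical; nothing below calls Spark or Mosaic.

def preview(rows, limit):
    """Return only the first `limit` rows, like a row-limited display."""
    return rows[:limit]

def count_invalid(rows):
    """Count rows whose (pretend) validity flag is False."""
    return sum(1 for row in rows if not row["is_valid"])

# Ten rows; the rows with id 8 and 9 are marked invalid.
rows = [{"id": i, "is_valid": i < 8} for i in range(10)]

# The same data returned in two different orders.
order_a = rows
order_b = list(reversed(rows))

LIMIT = 5  # assumed preview size for this toy example

preview_a = count_invalid(preview(order_a, LIMIT))
preview_b = count_invalid(preview(order_b, LIMIT))
full_a = count_invalid(order_a)
full_b = count_invalid(order_b)

print("invalid rows seen in previews:", preview_a, preview_b)
print("invalid rows in full data:", full_a, full_b)
```

The two previews disagree while the full counts match, so if df.count() gives the same number for both column orders, the difference was most likely in what got displayed rather than in st_isvalid itself.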

edurdevic avatar Jan 09 '23 10:01 edurdevic

@KafesCNH any chance you can help to re-create the issue? (unless you have it already fixed)

r3stl355 avatar Apr 13 '23 18:04 r3stl355