datafusion-comet
Comparison between negative zero and false produces incorrect result
Describe the bug
SQL
SELECT c30, c98, c30 = c98 FROM test0 ORDER BY c30, c98;
Spark Plan
AdaptiveSparkPlan isFinalPlan=true
+- == Final Plan ==
   *(2) Sort [c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST], true, 0
   +- AQEShuffleRead coalesced
      +- ShuffleQueryStage 0
         +- Exchange rangepartitioning(c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [plan_id=16892]
            +- *(1) Project [c30#30, c98#98, (c30#30 = cast(c98#98 as float)) AS (c30 = c98)#15740]
               +- *(1) ColumnarToRow
                  +- FileScan parquet [c30#30,c98#98] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/home/andy/git/apache/datafusion-comet/fuzz-testing/test0.parquet], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<c30:float,c98:boolean>
+- == Initial Plan ==
   Sort [c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST], true, 0
   +- Exchange rangepartitioning(c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [plan_id=16878]
      +- Project [c30#30, c98#98, (c30#30 = cast(c98#98 as float)) AS (c30 = c98)#15740]
         +- FileScan parquet [c30#30,c98#98] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/home/andy/git/apache/datafusion-comet/fuzz-testing/test0.parquet], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<c30:float,c98:boolean>
Comet Plan
AdaptiveSparkPlan isFinalPlan=true
+- == Final Plan ==
   *(2) Sort [c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST], true, 0
   +- AQEShuffleRead coalesced
      +- ShuffleQueryStage 0
         +- Exchange rangepartitioning(c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [plan_id=16957]
            +- *(1) ColumnarToRow
               +- CometProject [c30#30, c98#98, (c30 = c98)#15748], [c30#30, c98#98, (c30#30 = cast(c98#98 as float)) AS (c30 = c98)#15748]
                  +- CometScan parquet [c30#30,c98#98] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/home/andy/git/apache/datafusion-comet/fuzz-testing/test0.parquet], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<c30:float,c98:boolean>
+- == Initial Plan ==
   Sort [c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST], true, 0
   +- Exchange rangepartitioning(c30#30 ASC NULLS FIRST, c98#98 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [plan_id=16937]
      +- CometProject [c30#30, c98#98, (c30 = c98)#15748], [c30#30, c98#98, (c30#30 = cast(c98#98 as float)) AS (c30 = c98)#15748]
         +- CometScan parquet [c30#30,c98#98] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/home/andy/git/apache/datafusion-comet/fuzz-testing/test0.parquet], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<c30:float,c98:boolean>
First difference at row 23:
Spark: -0.0,false,true
Comet: -0.0,false,false
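For context on why Spark's answer is the expected one: both plans evaluate `c30 = cast(c98 as float)`, and under IEEE 754 equality `-0.0` and `0.0` compare equal even though their bit patterns differ. The sketch below illustrates this in plain Python, outside Spark and Comet. The helper `float32_bits` is a hypothetical name introduced here, and `float(False)` only models the boolean-to-float cast. A comparison based on bit patterns is shown purely as one possible way to arrive at `false`; this report does not confirm that it is the actual cause.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


neg_zero = -0.0
false_as_float = float(False)  # assumption: models cast(c98 as float) for c98 = false

# IEEE 754 equality treats -0.0 and 0.0 as equal (matches the Spark row).
ieee_equal = neg_zero == false_as_float

# The sign bit differs, so a bitwise comparison would disagree
# (matches the Comet row, but is only a guess at the mechanism).
bits_equal = float32_bits(neg_zero) == float32_bits(false_as_float)

print(ieee_equal)  # True
print(bits_equal)  # False
```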
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response
We should test with positive zero (0.0) as well
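A rough sketch of what that extra coverage could look like, written as a plain Python table of expected results and not as an actual Comet test. The helper `expected_equal` and the choice of sample values are assumptions made for illustration. The expectation encoded here is IEEE 754 equality, which matches the Spark output in the report.

```python
import itertools


def expected_equal(f: float, b: bool) -> bool:
    """Hypothetical oracle: models c30 = cast(c98 as float) with IEEE 754 equality."""
    return f == float(b)


floats = [0.0, -0.0, 1.0]  # both signed zeros plus a non-zero control value
bools = [False, True]

# Key on repr(f): 0.0 and -0.0 compare and hash equal, so using the raw
# float as a dict key would silently merge the two zero cases.
expected = {
    (repr(f), b): expected_equal(f, b)
    for f, b in itertools.product(floats, bools)
}

for (f_repr, b), result in expected.items():
    print(f"{f_repr},{str(b).lower()},{str(result).lower()}")
```

The printed rows use the same `float,boolean,result` layout as the diff in the report, so they can be compared by eye against query output.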
We should fix this upstream: https://github.com/apache/datafusion/issues/11108