FilterExpr to be rewritten to remove nulls as early as possible

Open gatesn opened this issue 8 months ago • 2 comments

Filter expressions in the scan convert nulls to false in Mask::try_from(&dyn Array). The earlier we can remove nulls, the cheaper these expressions are to compute, rather than pointlessly respecting nulls all the way to the root of the expression.

We will need is_null and is_not_null expressions to implement this.
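To make the idea concrete, here is a minimal Python sketch of the kind of inference involved. It is not Vortex code: the `Col`, `Lit`, `Gt`, `And`, and `Or` classes and the `required_non_null` helper are hypothetical names made up for illustration. It assumes the root of the filter maps null to false, that comparisons are null-intolerant (a null operand gives a null result), and that AND/OR follow three-valued (Kleene) logic. Under those assumptions, any column that must be non-null for the filter to be true can be checked up front with an is_not_null test.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: Optional[int]


@dataclass(frozen=True)
class Gt:  # assumed null-intolerant: a null operand gives a null result
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Lit, Gt, And, Or]


def evaluate(expr, row):
    """Three-valued (Kleene) evaluation; None stands for null."""
    if isinstance(expr, Col):
        return row[expr.name]
    if isinstance(expr, Lit):
        return expr.value
    left, right = evaluate(expr.left, row), evaluate(expr.right, row)
    if isinstance(expr, Gt):
        return None if left is None or right is None else left > right
    if isinstance(expr, And):
        if left is False or right is False:
            return False
        return None if left is None or right is None else True
    if isinstance(expr, Or):
        if left is True or right is True:
            return True
        return None if left is None or right is None else False
    raise TypeError(f"unsupported expression: {expr!r}")


def required_non_null(expr):
    """Columns that must be non-null for expr to evaluate to true."""
    if isinstance(expr, Gt):
        # Only direct column operands; nested operands are ignored (conservative).
        return {e.name for e in (expr.left, expr.right) if isinstance(e, Col)}
    if isinstance(expr, And):
        return required_non_null(expr.left) | required_non_null(expr.right)
    if isinstance(expr, Or):
        return required_non_null(expr.left) & required_non_null(expr.right)
    return set()


def mask(expr, row):
    """Root-level behaviour: null is treated as false."""
    return evaluate(expr, row) is True


def mask_with_prefilter(expr, row):
    """Reject rows failing the inferred is_not_null checks before evaluating."""
    if any(row[c] is None for c in required_non_null(expr)):
        return False
    return mask(expr, row)


# a > 5 AND (b > 0 OR a > c)
expr = And(Gt(Col("a"), Lit(5)), Or(Gt(Col("b"), Lit(0)), Gt(Col("a"), Col("c"))))
print(required_non_null(expr))  # {'a'}
```

Only `a` is inferred here, because either branch of the OR can be true while the other branch's columns are null. The sketch deliberately leaves out NOT: coercing null to false below a NOT would change the result (NOT null is null, but NOT false is true), so a real rewrite would need to treat negation separately.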

@robert3005 says Spark performs a similar transformation. If you could track it down, that would be helpful!

gatesn avatar Apr 15 '25 15:04 gatesn

Here's the logic Spark uses to infer the constraints: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/QueryPlanConstraints.scala#L113

robert3005 avatar Apr 15 '25 16:04 robert3005

Does anyone have an example of where this matters in workflows we currently use?

joseph-isaacs avatar Apr 17 '25 14:04 joseph-isaacs