datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

feat: continue visit children expr of short-circuit expr

Open zhuliquan opened this issue 2 months ago • 1 comments

Which issue does this PR close?

Closes #.

Rationale for this change

Recently. I find that some common subexpr doesn't be eliminated via CommonSubexprEliminate optimizer. for example below logical_plan don't reuse expr rest.a + test.b

Filter: test.a + test.b = Int32(2) OR test.a + test.b = Int32(3) OR test.c > Int32(4)
  Projection: test.a, test.b, test.c
    TableScan: test

I notice that test.a + test.b is child expr of short-circuit expr test.a + test.b = Int32(2) OR test.a + test.b = Int32(3). according to new datafusion code in common_subexpr_eliminate.rs: https://github.com/apache/datafusion/blob/1155b0b15e6ce3a8d5d28e5ecaebf4706448c548/datafusion/optimizer/src/common_subexpr_eliminate.rs#L856-L859 optimizer will jump short-circuit expr and don't visit children expr recursively, that it will lose the possibility of discovering that children expressions are common. I think it should be mainly to check whether it needs to count its expression rather than to avoid exploring child expressions recursively. so, I think need to remove short-circuit expr checking from func f_down and add checking in below code on func f_up:

https://github.com/apache/datafusion/blob/1155b0b15e6ce3a8d5d28e5ecaebf4706448c548/datafusion/optimizer/src/common_subexpr_eliminate.rs#L878-L882

After modification. we can't get new optimized plan:

Projection: test.a, test.b, test.c
  Filter: __common_expr_1 = Int32(2) OR __common_expr_1 = Int32(3) OR test.c > Int32(4)
    Projection: test.a + test.b AS __common_expr_1, test.a, test.b, test.c
      Projection: test.a, test.b, test.c
        TableScan: test

What changes are included in this PR?

allow continue visit children expr of short-circuit expr and avoid count short-circuit expr itself.

Are these changes tested?

yes, and I add test case test_filter_expressions for reused child expr of short-circuit expr.

Are there any user-facing changes?

no.

zhuliquan avatar Jun 20 '24 16:06 zhuliquan