
PushDownFilter optimizer pushes down filters through distinct on

Open epsio-banay opened this issue 1 year ago • 0 comments

Describe the bug

A filter should not be pushed down through a DistinctOn node, because doing so can change the query results. The PushDownFilter optimizer rule currently pushes the filter below DistinctOn, which is incorrect.
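To illustrate why the order matters, here is a minimal std-only sketch that does not use DataFusion at all. The `Row` type, the sample rows, and the helpers `distinct_on_a` and `filter_b` are hypothetical and exist only for this illustration; the sketch assumes DISTINCT ON keeps the first row seen per key. The effect is easiest to see when the predicate refers to a column outside the ON expression:

```rust
use std::collections::HashSet;

// Hypothetical row type and data, not taken from the issue.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Row {
    a: i64,
    b: i64,
}

/// Keeps the first row seen for each distinct value of `a`,
/// mimicking `DISTINCT ON (a)` over an already ordered input.
fn distinct_on_a(rows: &[Row]) -> Vec<Row> {
    let mut seen = HashSet::new();
    rows.iter().copied().filter(|r| seen.insert(r.a)).collect()
}

/// Keeps only the rows whose `b` equals the given value.
fn filter_b(rows: &[Row], b: i64) -> Vec<Row> {
    rows.iter().copied().filter(|r| r.b == b).collect()
}

fn main() {
    let rows = [
        Row { a: 1, b: 10 },
        Row { a: 1, b: 20 },
        Row { a: 2, b: 20 },
    ];

    // Filter above DistinctOn: deduplicate first, then filter.
    let filter_above = filter_b(&distinct_on_a(&rows), 20);
    // Filter pushed below DistinctOn: filter first, then deduplicate.
    let filter_below = distinct_on_a(&filter_b(&rows, 20));

    println!("filter above: {:?}", filter_above);
    println!("filter below: {:?}", filter_below);

    // The two orders disagree, so the pushdown is not a safe rewrite here.
    assert_ne!(filter_above, filter_below);
}
```

With the filter above, the row `(1, 20)` is never returned because DISTINCT ON already picked `(1, 10)` for `a = 1`; with the filter pushed below, `(1, 20)` becomes the first row for that key and appears in the result.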

To Reproduce

Add this test to push_down_filter.rs:

#[test]
fn distinct_on() -> Result<()> {
    let table_scan = test_table_scan()?;
    let plan = LogicalPlanBuilder::from(table_scan)
        .distinct_on(vec![col("a")], vec![col("a")], None)?
        .filter(col("a").eq(lit(1i64)))?
        .build()?;
    // the filter must stay above DistinctOn and not be pushed into the scan
    let expected = "\
    Filter: a = Int64(1)\
    \n  DistinctOn: on_expr=[[test.a]], select_expr=[[a]], sort_expr=[[]]\
    \n    TableScan: test";
    assert_optimized_plan_eq(plan, expected)
}

The test fails, because the optimizer produces this logical plan instead:

DistinctOn: on_expr=[[test.a]], select_expr=[[a]], sort_expr=[[]]
  TableScan: test, full_filters=[a = Int64(1)]

Expected behavior

Filters should not be pushed down through DistinctOn. The optimized logical plan should match the expected plan in the test above, with the Filter node remaining above DistinctOn.

Additional context

No response

epsio-banay, Oct 15 '24 14:10