datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Sort Merge Join. LeftAnti issues

Open comphead opened this issue 9 months ago • 1 comments

Describe the bug

SMJ LeftAnti join returns the wrong result

To Reproduce

#[tokio::test]
    async fn test_anti() -> Result<()> {
        let ctx: SessionContext = SessionContext::new();

        let sql = "set datafusion.optimizer.prefer_hash_join = false;";
        let _ = ctx.sql(sql).await?.collect().await?;

        println!("=============================================================================");
        let sql = "set datafusion.execution.batch_size = 1";
        let _ = ctx.sql(sql).await?.collect().await?;

        let sql = "
            select * from (
            with
            t1 as (
                select 11 a, 12 b),
            t2 as (
                select 11 a, 13 b union all
                select 11 a, 14 b
                )
            select t1.* from t1 where not exists (select 1 from t2 where t2.a = t1.a and t2.b < t1.b)
            ) order by 1, 2;
        ";

        let actual = ctx.sql(sql).await?.collect().await?;
        let expected: Vec<&str> = vec![
            "+----+----+",
            "| a  | b  |",
            "+----+----+",
            "| 11 | 12 |",
            "+----+----+",
        ];
        datafusion_common::assert_batches_eq!(expected, &actual);

        Ok(())
    }

Expected behavior

Test should pass

Additional context

No response

comphead avatar May 04 '24 16:05 comphead

Check the LeftAnti issue reported in HJ https://github.com/apache/datafusion/issues/10583 to be handled also in SMJ

comphead avatar May 22 '24 15:05 comphead

@comphead can I pick this or are you arleady working on a solution?

edmondop avatar May 24 '24 20:05 edmondop

Thanks @edmondop I'm already working on it, if you have spare time and would like to work on SMJ its really needed to add fuzz tests for SMJ when the filter set, I'll create a ticket

Filed #10659

comphead avatar May 24 '24 21:05 comphead

Thanks. Will work on #10659 !

edmondop avatar May 24 '24 22:05 edmondop