datafusion
datafusion copied to clipboard
Sort Merge Join. LeftAnti issues
Describe the bug
SMJ LeftAnti join returns the wrong result
To Reproduce
#[tokio::test]
async fn test_anti() -> Result<()> {
let ctx: SessionContext = SessionContext::new();
let sql = "set datafusion.optimizer.prefer_hash_join = false;";
let _ = ctx.sql(sql).await?.collect().await?;
println!("=============================================================================");
let sql = "set datafusion.execution.batch_size = 1";
let _ = ctx.sql(sql).await?.collect().await?;
let sql = "
select * from (
with
t1 as (
select 11 a, 12 b),
t2 as (
select 11 a, 13 b union all
select 11 a, 14 b
)
select t1.* from t1 where not exists (select 1 from t2 where t2.a = t1.a and t2.b < t1.b)
) order by 1, 2;
";
let actual = ctx.sql(sql).await?.collect().await?;
let expected: Vec<&str> = vec![
"+----+----+",
"| a | b |",
"+----+----+",
"| 11 | 12 |",
"+----+----+",
];
datafusion_common::assert_batches_eq!(expected, &actual);
Ok(())
}
Expected behavior
Test should pass
Additional context
No response
Check the LeftAnti issue reported in HJ https://github.com/apache/datafusion/issues/10583 to be handled also in SMJ
@comphead can I pick this or are you arleady working on a solution?
Thanks @edmondop I'm already working on it, if you have spare time and would like to work on SMJ its really needed to add fuzz tests for SMJ when the filter set, I'll create a ticket
Filed #10659
Thanks. Will work on #10659 !