KiteSQL icon indicating copy to clipboard operation
KiteSQL copied to clipboard

Feat: Support correlated subqueries

Open crwen opened this issue 1 year ago • 4 comments

Feature Request

there is something wrong when binding correlated subqueries because parent context is ignored.

explain select a,b from t where a in (select a from t1 where a > t.a);
ERROR:  invalid table: t

After fix this bug, we may have the plan like this:

                               PLAN
------------------------------------------------------------------
 Projection [t.a, t.b] [Project]                                 +
   LeftSemi Join On t.a = (t1.a) as (_temp_table_0_.a) [HashJoin]+
     Scan t -> [a, b] [SeqScan]                                  +
     Projection [(t1.a) as (_temp_table_0_.a)] [Project]         +
       Projection [t1.a] [Project]                               +
         Filter (t1.a > t.a), Is Having: false [Filter]          +
           Scan t1 -> [a] [SeqScan]

Filter (t1.a > t.a) can't be handle correctly because it doesn't have the data of t. So we need to pull up this filter.

Maybe we could pull up all filters, and then push them down. so the things we should do next is here:

  • [ ] fix bind bug on correlated subqueries #191
    • [x] rollback on https://github.com/KipData/FnckSQL/pull/252
  • [ ] pull filter up
  • [ ] push filter down

crwen avatar Mar 30 '24 13:03 crwen

It seems that there is something wrong with HashJoin, I'm debugging now, I'll merge this first

KKould avatar Mar 30 '24 14:03 KKould

It seems that there is something wrong with HashJoin, I'm debugging now, I'll merge this first

ops, I found the problem. In correlated subqueries, there is no left data in right. so this case will return empty data (wrong)

let fnck_sql = DataBaseBuilder::path(temp_dir.path())
    .build()
    .await?;
let _ = fnck_sql
    .run("create table t2(id int primary key, a int not null, b int not null);")
    .await?;
let _ = fnck_sql
    .run("create table t3(id int primary key, a int not null, c int not null);")
    .await?;
let _ = fnck_sql
    .run("insert into t2 values (0, 1, 2), (3, 4, 5), (8, 0, 0);")
    .await?;
let _ = fnck_sql
    .run("insert into t3 values (0, 2, 2), (3, 8, 5);")
    .await?;
let (schema, tuples) = fnck_sql.run("select id,a,b from t2 where id in (select a from t3 where a > t2.a);").await?;
println!("{}", create_table(&schema, &tuples));

there is only t3.a on the right but no t2.a, so filter always returns false

+--------------------------------------------------------------------+
| PLAN                                                               |
+====================================================================+
| Projection [t2.id, t2.a, t2.b] [Project]                           |
|   LeftSemi Join On t2.id = (t3.a) as (_temp_table_0_.a) [HashJoin] |
|     Scan t2 -> [id, a, b] [SeqScan]                                |
|     Projection [(t3.a) as (_temp_table_0_.a)] [Project]            |
|       Projection [t3.a] [Project]                                  |
|         Filter (t3.a > t2.a), Is Having: false [Filter]            |
|           Scan t3 -> [a] [SeqScan]                                 |
+--------------------------------------------------------------------+

I think this problem is caused by NormalizationRule: PushPredicateThroughJoin

KKould avatar Mar 30 '24 14:03 KKould

I think this problem is caused by NormalizationRule: PushPredicateThroughJoin

I don't think it would help. Filter is never pulled up since binding, so it won't be pushed down.

crwen avatar Mar 30 '24 15:03 crwen

yep, I made a mistake. This seems to be a feature that has not yet been implemented.

KKould avatar Mar 30 '24 15:03 KKould