dolt icon indicating copy to clipboard operation
dolt copied to clipboard

Nested subquery problems

Open max-hoffman opened this issue 2 years ago • 0 comments

The subquery below has an inner EXISTS that can be decorrelated into a join, but the resulting filter references a column from an outer subquery expression:

select x from xy where y in (select xy.x from xy join (select t2.y from xy t2 where exists (select t3.y from xy t3 where t3.y = xy.x)) t1);

Basically, unnesting only incrementally decorrelates scopes. We would need to pull t3.y = xy.x all the way to the root to convert this whole tree into a join.

The query below can also be unnested safely in one pass, but we place the correlated filter at the [xy_1][t1] edge, rather that at the top of the tree at [xy][xy_1 x t1]:

select x from xy where y in (select x from xy where exists (select y from xy t1 where t1.y = xy.x));

I am making fixes for this not to panic, but there are a few source issues I'm not addressing:

  • Unnesting EXISTS and IN should happen before all join planning and probably be recursive. Otherwise, we seem to interleave unnesting and join planning.
  • The two should use similar code to avoid inconsistencies/having to fix the bugs in two places. EXISTS could probably be normalized to IN subqueries.
  • Join planning currently crashes when we see a join filter from an outer scope. Ideally we would have 1) unnested this join already, but also 2) technically we could treat out of scope column references as constants.

max-hoffman avatar Dec 04 '23 20:12 max-hoffman