materialize
materialize copied to clipboard
Differential join can pick plan that reuses less arrangements when there are more than two inputs
Seen in a prospect query. There are no keys on foo, bar, or baz. There is only an index on foo(a).
select <stuff>
from foo
inner join bar on foo.a = bar.a
inner join baz on foo.b = baz.b
The plan came out to something like this:
%0 =
| Get foo
| ArrangeBy (b)
%1 =
| Get bar
| ArrangeBy (a)
%2 =
| Get baz
%3 =
| Join %0 %1 %2 (= %0.a %1.a) (= %0.b %1.b)
| | implementation = Differential %2 %0.(b) %1.(a)
| | demand = <stuff>
A better plan would've been
%0 =
| Get foo
| ArrangeBy (a)
%1 =
| Get bar
| ArrangeBy (b)
%2 =
| Get baz
%3 =
| Join %0 %1 %2 (= %0.a %1.a) (= %0.b %1.b)
| | implementation = Differential %1 %0.(a) %2.(b)
| | demand = <stuff>
since this would allow reusing the index on foo(a).
Currently, the way that a differential join plan is picked is that each input in the plan except for the first one gets a rating (technically called a characteristic).
- The plan
%2 %0.(b) %1.(a)gets the set of ratings [(unique_key: false, key_length 1, prexisting_arrangement: false, input: 0), (unique_key: false, key_length: 1, prexisting_arrangement: false, input 1)] - The plan
%1 %0.(a) %2.(b)gets the set of ratings [(unique_key: false, key_length 1, prexisting_arrangement: true, input: 0), (unique_key: false, key_length: 1, prexisting_arrangement: false, input: 2)] Then we take the worst rating from each plan and compare them to one another. So we are comparing - (unique_key: false, key_length: 1, prexisting_arrangement: false, input 1) (from plan
%2 %0.(b) %1.(a)) - (unique_key: false, key_length: 1, prexisting_arrangement: false, input: 2) (from plan
%1 %0.(a) %2.(b)) which are equally bad. The tiebreaker is the input number. Because 1 < 2, plan%2 %0.(b) %1.(a)wins.
A better tiebreaker would be the second worst rating from each plan, third worst rating, and so on.
The workaround is to create a subview:
create view subview as
select foo.b as b, <stuff>
from foo
inner join bar on foo.a = bar.a
select <stuff>
from subview
inner join baz on subview.b = baz.b