risingwave icon indicating copy to clipboard operation
risingwave copied to clipboard

Optimizer: push down predicate inferred by hash join equal condition

Open yuhao-su opened this issue 2 years ago • 9 comments

For queries like select * from a, b on a.x = b.x and a.y = b.x, we can push down a filter before side a with condition a.x=a.y and only keep one of the two join condition in hash join executor.

Such optimization can apply to almost all kinds of hash join with the exception of anti-join.

For anti-join we can not push down a filter or keep let join key. Instead, we need to evaluate the predicate in hash join executor. If a.x=a.y is false, we know the join condition a.x = b.x and a.y = b.x is false.

This can also solve the #7698

yuhao-su avatar Mar 06 '23 12:03 yuhao-su

cc. @chenzl25 @st1page

yuhao-su avatar Mar 06 '23 12:03 yuhao-su

It will little more complex when we consider the NULL-safe equal condition :thinking: We can handle them respectively but I am not sure if we can handle this condition left.a is not distinct from right.b AND left.c = right.b. can we push down the condition like left.a = left.c OR left.a is null OR left.c is null? Maybe we need to re-confirm the query requirement from the user side and if we need to handle the NULL-safe equal condition here. IIRC, it is to handle some ring detection. c.c. @lmatz

st1page avatar Mar 06 '23 13:03 st1page

Yeah, in the original use case, a and b are different alias of the same table that is used to represent a graph, where x and y are src and dst of an edge. So I guess we can assume that x and y are both non-null in this particular use case.

lmatz avatar Mar 06 '23 13:03 lmatz

Using an optimization rule to solve a panic sounds a little bit weird to me. It would be a required step to transform plans into a preferred form, rather than an optional "optimization".

xiangjinwu avatar Mar 07 '23 03:03 xiangjinwu

It will little more complex when we consider the NULL-safe equal condition 🤔 We can handle them respectively but I am not sure if we can handle this condition left.a is not distinct from right.b AND left.c = right.b. can we push down the condition like left.a = left.c OR left.a is null OR left.c is null? Maybe we need to re-confirm the query requirement from the user side and if we need to handle the NULL-safe equal condition here. IIRC, it is to handle some ring detection. c.c. @lmatz

Once null safe equal appears, we can just push down a condition like left.a is not distinct from left.c, because it it strict enough to filter out most records, except the null one. Luckily, the upper condition left.a is not distinct from right.b AND left.c = right.b can do this favor for us.

chenzl25 avatar Mar 07 '23 03:03 chenzl25

Using an optimization rule to solve a panic sounds a little bit weird to me. It would be a required step to transform plans into a preferred form, rather than an optional "optimization".

Good point. Although this is a potential way to address the problem, we decided to fix the bug by #8377 and treat this one as an optimization.

yuhao-su avatar Mar 07 '23 04:03 yuhao-su

We do something similar here: https://github.com/risingwavelabs/risingwave/blob/495763326e04d2e622b3ab844db5da72c5fc5a01/src/frontend/src/optimizer/plan_node/logical_join.rs#L837

But this is implemented only for one-sided expression.

Needs futher investigation

By applying this to eq conditions, this could turn the following condition: select * from a, b on a.x = b.x and a.y = b.x

into

select * from a, b on a.x = a.y and a.y = a.x and a.x = b.x, and a.y = b.x

NULL safety reasoning could be extended from here: https://github.com/risingwavelabs/risingwave/pull/6895. In particular: we "can only handle pushing down to an inner side.". Not sure if it is accurate when applied reflexively to eq conditions.

jon-chuang avatar Mar 08 '23 04:03 jon-chuang

Hey, any updates?

fuyufjh avatar Apr 17 '23 09:04 fuyufjh

Hey, any updates?

Because we solve the https://github.com/risingwavelabs/risingwave/issues/7698, it becomes less emergent and a optimization.

st1page avatar Apr 17 '23 09:04 st1page