risingwave
optimizer: retain MultiJoin in optimized logical plan
The optimized logical plan should contain a multi join node (and probably with join order). We should expand it in to_stream or to_batch.
One reason is that it makes index delta join easier to implement. In to_stream, if we match indices, we can directly convert it into a delta join plan.
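To illustrate the idea, here is a minimal sketch of how expanding a multi join in to_stream could pick a delta join when indices match. All types and names (`JoinInput`, `StreamPlan`, `multi_join_to_stream`) are hypothetical and do not mirror the actual RisingWave structs, and the rule "every input has an index on its join key" is an assumption made for the example.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct JoinInput {
    table: String,
    // Column this input is joined on.
    join_key: String,
    // Columns on which this table has an index.
    indexed_columns: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum StreamPlan {
    // A single delta join over all inputs, served by index lookups.
    DeltaJoin(Vec<String>),
    // Fallback: a chain of binary hash joins in the given input order.
    HashJoinChain(Vec<String>),
}

// Assumed rule: a delta join is chosen only when the multi join is non-empty
// and every input has an index on its join key; otherwise fall back.
fn multi_join_to_stream(inputs: &[JoinInput]) -> StreamPlan {
    let tables: Vec<String> = inputs.iter().map(|i| i.table.clone()).collect();
    let all_indexed = !inputs.is_empty()
        && inputs
            .iter()
            .all(|i| i.indexed_columns.contains(&i.join_key));
    if all_indexed {
        StreamPlan::DeltaJoin(tables)
    } else {
        StreamPlan::HashJoinChain(tables)
    }
}
```

Because the multi join node is still intact at this point, the index check looks at all inputs at once instead of having to recognize a pattern across already-expanded binary joins.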
But there are cons. We will need to do predicate pushdown and some other rules in gen stream plan / gen batch plan, which is not well-tested currently.
Had a discussion with @st1page today, and I would +1 for this.
The con is that LogicalMultiJoin.to_stream and to_batch become complicated: they need to do 1) join reordering and 2) selection of the physical operators for Join, and these are typically the hardest parts of a SQL optimizer. 😇
We can decide join ordering in MultiJoin node (when doing logical optimization)
I think the join reordering in stream and batch might be different, so -1 for using the same join reorder path in the logical phase, but we can reuse code like heuristic_ordering.
btw, will we use stream lookup join for all the inner joins, or only for inputs with an index? If not for all, we might need to use lookup join for some parts of the multijoin and stream join for the other parts.
We will need to do predicate pushdown and some other rules in gen stream plan / gen batch plan
- for the batch plan, we can keep the logical predicate pushdown before to_batch
- for the stream plan, I think it might be by design. Before merging the inner joins into the multijoin, we have already pushed the predicates down, so the remaining predicate pushdown is just to push the predicates above the multijoin into the joins' on conditions, and we should do this pushdown on the rewritten lookup-join plan.
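
A rough sketch of the last point, assuming the multijoin has been expanded into a left-deep chain of joins: each predicate that sat above the multijoin is attached to the on condition of the lowest join whose inputs cover every input the predicate references. The `Predicate` type and `push_into_on_conditions` function are hypothetical names for this example, not existing optimizer APIs.

```rust
use std::collections::BTreeSet;

// Hypothetical predicate: a label plus the positions of the multijoin
// inputs it references.
#[derive(Debug, Clone, PartialEq)]
struct Predicate {
    expr: String,
    inputs: BTreeSet<usize>,
}

// `order` is the chosen left-deep join order over multijoin input positions.
// Join k (0-based) combines the inputs order[0] ..= order[k + 1].
// Returns the predicates attached to each join's on condition, plus any
// predicates that no join in the chain can cover.
fn push_into_on_conditions(
    order: &[usize],
    predicates: &[Predicate],
) -> (Vec<Vec<String>>, Vec<String>) {
    let join_count = order.len().saturating_sub(1);
    let mut on_conditions: Vec<Vec<String>> = vec![Vec::new(); join_count];
    let mut remaining = Vec::new();
    for pred in predicates {
        // Find the lowest join whose accumulated inputs cover the predicate.
        let target = (0..join_count).find(|&k| {
            let available: BTreeSet<usize> = order[..=k + 1].iter().copied().collect();
            pred.inputs.is_subset(&available)
        });
        match target {
            Some(k) => on_conditions[k].push(pred.expr.clone()),
            None => remaining.push(pred.expr.clone()),
        }
    }
    (on_conditions, remaining)
}
```

Running this after the join order and physical join kinds are fixed matches the suggestion above: the pushdown operates on the rewritten plan rather than on the multijoin itself.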