risingwave
risingwave copied to clipboard
fragmenter: remove 1v1 exchange rule & move the optimization to compute node
We've introduced the "1v1 exchange rewrite" in #1745 that split multiple stateful operators into different fragments, then connect them with no-shuffle
exchange. However, this breaks the assumption that every fragment can be scheduled or scaled independently: if we want to scale out one of the fragments, either we need to resolve the related upstream/downstream fragment and scale them alongside, or we need to replace all dispatchers with hash dispatchers. This lead to extra complexity for the meta service and the cloud manager.
Considering that our purpose is to increase the I/O concurrency, and the benchmark show that the compute parallelism is good enough, I suggest removing this rule(rewrite) from the fragmenter and letting the actor in compute nodes decide whether to do this optimization: multiple stateful executors are still in a single actor/fragment logically, while the actor may join
multiple ActorStage
to achieve I/O concurrency.
Any ideas are welcome. cc @skyzh @st1page @fuyufjh @shanicky
Update: A more detailed doc describing the issue: https://singularity-data.quip.com/GU2ZAhJdBhCJ/The-Future-of-No-shuffle-Exchange
+1
Any doc for ActorStage
? This batch-streaming naming looks very interesting.
dup w/ https://github.com/singularity-data/risingwave/issues/3607, I've proposed it long before!
+1. No shuffle exchange looks bad when there's scale-in / scale-out. If we don't want to handle this as a special case when scale, this proposal looks good.
may also close https://github.com/singularity-data/risingwave/issues/3607
Closed via #5449.