flink
flink copied to clipboard
[FLINK-33743][runtime] Support consuming multiple subpartitions on a single channel
What is the purpose of the change
This pull request adds support for consuming multiple subpartitions in a single input channel.
Brief change log
- Separate the notion of subpartition and channel in terms of naming, signature and comments.
- Union the output from multiple subpartitions with one ResultSubpartitionView and TierConsumerAgent
- Control the partial record split logic when writing buffers into subpartition to avoid potential deadlocks
Verifying this change
This change added tests and can be verified as follows:
- Extended HybridShuffleITCase to cover cases when the feature proposed in this PR comes into effect.
- Added unit tests for newly introduced options and classes
- Manually verified situations that are not covered by e2e tests, like when tiered hybrid shuffle uses sort buffer.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
@Public(Evolving)
: no - The serializers: yes
- The runtime per-record code paths (performance sensitive): yes
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? no need to document
CI report:
- c1ad90d326a8c20ab78a8809f65effd7c9a3a66c Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:-
@flinkbot run azure
re-run the last Azure build
Hi @TanYuxin-tyx could you please take a look at this PR?
@yunfengzhou-hub Thanks for the update. I have no more comments on the change.
Hi @reswqa Could you please take a look at this PR?