[FLINK-35157][runtime] Sources with watermark alignment get stuck once some subtasks finish

Open elon-X opened this issue 9 months ago • 2 comments

What is the purpose of the change

Sources with watermark alignment get stuck once some subtasks finish. This PR fixes that problem.

Brief change log

When some subtasks have finished, the SourceOperator sends Long.MAX_VALUE to the SourceCoordinator, and the SourceCoordinator checks whether a subtask has finished before sending the event.
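
The two changes above can be illustrated with a minimal sketch. It is not the code in this PR: `AlignmentSketch` and its methods are hypothetical names, and it assumes the coordinator aggregates the reported watermarks by taking their minimum. Under that assumption, a finished subtask that reports Long.MAX_VALUE no longer holds the minimum back, and finished subtasks are skipped when choosing event recipients.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical, simplified model of the coordinator side; not Flink's actual classes. */
public class AlignmentSketch {
    // Last watermark reported by each subtask (assumed bookkeeping).
    private final Map<Integer, Long> reported = new TreeMap<>();
    // Subtasks known to have finished.
    private final Set<Integer> finished = new HashSet<>();

    /** A running subtask reports its current watermark. */
    public void report(int subtask, long watermark) {
        reported.put(subtask, watermark);
    }

    /** A finishing subtask reports Long.MAX_VALUE so it stops holding back the minimum. */
    public void finish(int subtask) {
        reported.put(subtask, Long.MAX_VALUE);
        finished.add(subtask);
    }

    /** Minimum over all reported watermarks (assumed aggregation rule). */
    public long minWatermark() {
        long min = Long.MAX_VALUE;
        for (long watermark : reported.values()) {
            min = Math.min(min, watermark);
        }
        return min;
    }

    /** Subtasks that should still receive the alignment event: finished ones are skipped. */
    public List<Integer> eventTargets() {
        List<Integer> targets = new ArrayList<>();
        for (int subtask : reported.keySet()) {
            if (!finished.contains(subtask)) {
                targets.add(subtask);
            }
        }
        return targets;
    }
}
```

In this model, if subtask 0 stopped at watermark 100 and never reported again, the minimum would stay at 100 and the remaining subtasks could never advance past the allowed drift. Reporting Long.MAX_VALUE on finish removes that floor.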

Verifying this change

This change added tests and can be verified as follows:

  • org.apache.flink.streaming.api.operators.SourceOperatorAlignmentTest::testWatermarkAlignmentWhileSubtaskFinished()
  • org.apache.flink.runtime.source.coordinator.SourceCoordinatorAlignmentTest::testWatermarkAlignmentWhileSubtaskFinished()

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no

elon-X, May 06 '24 14:05

CI report:

  • 8ac3d8af3aefac55f13a37ed969abd79fc97a65a Azure: SUCCESS

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

flinkbot, May 06 '24 14:05

Hi @1996fanrui, would you mind reviewing this for me when you have a moment? Thank you very much!

elon-X, May 07 '24 02:05

@1996fanrui I've made some changes based on your suggestions. Please review the changes when you have a chance and let me know if there are any further improvements needed. Thanks!

elon-X, Jun 05 '24 05:06

@elon-X The CI is failing. Could you rebase onto the master branch first? We can follow up on the CI after rebasing.

1996fanrui, Jun 13 '24 06:06