[FLINK-36663][Window]Fix the first processWatermark has extra data after restore by restore timeService's watermark.
What is the purpose of the change
Restore the currentWatermark of WindowTimerService. If not restored, the slidingWindow will output extra data.
Brief change log
Restore the currentWatermark of WindowTimerService.
Verifying this change
This change added tests and can be verified as follows:
- Added test that validates whether the hopping window restores the output data consistently.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
@Public(Evolving): no - The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
CI report:
- 373adb9fe0cf38ecbc2a842bc684aa0da3c6bc43 Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:@flinkbot run azurere-run the last Azure build
@xing1mo The Azure tests are failing in testCheckpointRescalingInKeyedState.
Reviewed by Chi on 05/12/24. Asked submitter questions
@xing1mo The Azure tests are failing in
testCheckpointRescalingInKeyedState.This is a known problem and there is a certain probability that it will occur. https://issues.apache.org/jira/browse/FLINK-36613
@flinkbot run azure
@flinkbot run azure
@flinkbot run azure
I'm wondering if other window operators have this bug like window join, window rank, window dedup...
Not necessarily. The problem with windowAgg is that the currentWatermark in the timer service is used to determine the window file during agg. I don't know much about the logic of other operators.
Not necessarily. The problem with windowAgg is that the currentWatermark in the timer service is used to determine the window file during agg. I don't know much about the logic of other operators.
OK, I have created a separate jira for it https://issues.apache.org/jira/browse/FLINK-38055
@xing1mo Could you please bp this fix to 2.1 and 2.0?