[FLINK-33192] Fix Memory Leak in WindowOperator due to Improper Timer Cleanup
What is the purpose of the change
- This Pull Request addresses an important state memory leak issue identified within the default window operator of Apache Flink.
- After this change, the cleanup timer should be registered for every window that's added to the window state regardless of it emitting a result after it’s fired.
- This change is associated to the following bug: FLINK-33192
Brief change log
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator
Verifying this change
- WindowOperatorTest is working for functional correctness.
- Memory Leak Prevention Validation: Due to the nature of this change aimed at preventing memory leaks, conducting validation via a unit test presents challenges.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
@Public(Evolving): no - The serializers: no
- The runtime per-record code paths (performance sensitive): yes
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
CI report:
- d745169b106e3d20a90a744678a7ecbede5fd55d Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:-
@flinkbot run azurere-run the last Azure build
@kartikeypant Thanks for the PR, overall LGTM, would you like to add a test about this in WindowOperatorTest?
Hey @fredia, thank you for your feedback!
I have added the test testCleanupTimerWithEmptyStateNoResultForTumblingWindows (as specified) in WindowOperatorTest and updated the description of the PR with the same.
Could you please review the changes and let me know if any further adjustments are needed?
Sure, thanks @fredia for the guidance - will back-port these changes to v1.18/1.19/1.20 as well as suggested.
@flinkbot run azure