[FLINK-31949][python] make CountTumblingWindow default trigger to purge all elements on fire
What is the purpose of the change
In FLINK-26444, a couple of convenience window assigners were added to the Python Datastream API, including CountTumblingWindowAssigner. This assigner uses a CountTrigger by default, which produces TriggerResult.FIRE. As such, using this window assigner on a data stream will always produce a "state leak" since older count windows will always be retained without any chance to work on the elements again. This PR use TriggerResult.FIRE_AND_PURGE as default for CountTumblingWindowAssigner to fix state leak.
Brief change log
- refactor to add purge_on_fire boolean variable to CountTrigger constructor
- make CountTumblingWindowAssigner use CountTrigger() with purge_on_fire set to True
Verifying this change
- *Manually verified the change by running a production job with the change for 1 day, observe no JVM heap memory leak with the fix. The job used to restart duo to JVM heap OOM every few hours.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with
@Public(Evolving): (no) - The serializers: (no )
- The runtime per-record code paths (performance sensitive): ( no )
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)
Documentation
- Does this pull request introduce a new feature? ( no)
- If yes, how is the feature documented? (not applicable)
CI report:
- f24dfbbc49aa199ebb2f4c6887a4983a6b8831b2 Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:@flinkbot run azurere-run the last Azure build
This PR is being marked as stale since it has not had any activity in the last 90 days. If you would like to keep this PR alive, please leave a comment asking for a review. If the PR has merge conflicts, update it with the latest from the base branch.
If you are having difficulty finding a reviewer, please reach out to the community, contact details can be found here: https://flink.apache.org/what-is-flink/community/
If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.
This PR is being marked as stale since it has not had any activity in the last 90 days. If you would like to keep this PR alive, please leave a comment asking for a review. If the PR has merge conflicts, update it with the latest from the base branch.
If you are having difficulty finding a reviewer, please reach out to the community, contact details can be found here: https://flink.apache.org/what-is-flink/community/
If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.