flink-kubernetes-operator
[FLINK-35126] Rework default checkpoint progress check window
What is the purpose of the change
Currently the checkpoint progress health check window is configured as a fixed Duration. This makes it hard to enable by default, because a sensible window depends on the checkpoint interval.
At the same time, the operator already contains logic for a minimum progress check interval computed from the checkpoint timeout, tolerable failures and checkpoint interval.
Furthermore, for any job with checkpointing enabled this health check is very valuable to have enabled by default, similar to the restart health check. This PR therefore also proposes to enable the feature by default, with the window set to the minimum checkpoint check interval.
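To illustrate the idea, here is a minimal sketch of how a lower bound for the check window could be derived from the three checkpoint settings mentioned above. It is not the operator's actual code: the class `CheckpointProgressWindow`, its method names, and the exact formula (the larger of interval and timeout, multiplied by the tolerable failures plus two) are assumptions made for illustration only.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical helper; names and formula are illustrative assumptions, not operator code. */
final class CheckpointProgressWindow {

    private CheckpointProgressWindow() {}

    /** Assumed lower bound: the slower of interval/timeout, times (tolerable failures + 2). */
    static Duration minimumWindow(Duration interval, Duration timeout, int tolerableFailures) {
        // Leave room for every tolerated failure plus two extra attempts (assumed slack).
        int attempts = tolerableFailures + 2;
        Duration slowest = interval.compareTo(timeout) >= 0 ? interval : timeout;
        return slowest.multipliedBy(attempts);
    }

    /**
     * Window to use for the health check. Empty when checkpointing is not configured,
     * so the check is effectively skipped for such jobs.
     */
    static Optional<Duration> effectiveWindow(
            Duration interval,
            Duration timeout,
            int tolerableFailures,
            Optional<Duration> configuredWindow) {
        if (interval == null || interval.isZero() || interval.isNegative()) {
            return Optional.empty();
        }
        Duration min = minimumWindow(interval, timeout, tolerableFailures);
        // A user-configured window is honored only if it is larger than the computed minimum.
        Duration chosen = configuredWindow.filter(w -> w.compareTo(min) > 0).orElse(min);
        return Optional.of(chosen);
    }
}
```

With such a rule, a job that does not set a window explicitly falls back to the computed minimum, so the default scales with the job's own checkpoint settings instead of relying on a single fixed Duration.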
Brief change log
- Enable checkpoint progress health check by default (for jobs with checkpointing configured)
- Set the default window to the calculated minimum interval, which already enforces the lower bound
Verifying this change
Unit tests + manual verification
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the CustomResourceDescriptors: no
- Core observer or reconciler logic that is regularly executed: yes
Documentation
- Does this pull request introduce a new feature? no