flink-kubernetes-operator
[FLINK-35265] Implement FlinkStateSnapshot custom resource
What is the purpose of the change
Implement FlinkStateSnapshot according to FLIP-446. This PR does not include the e2e tests or the documentation.
Brief change log
- Added FlinkStateSnapshot and all its dependent classes to flink-kubernetes-operator-api
- Deprecated several fields in FlinkDeployment/FlinkSessionJob, as accepted in the FLIP
- Refactored several methods in FlinkService to extract the logic of saving the snapshot path into other classes
- Added a test in the FlinkConfigManager class to check, at runtime, whether the FlinkStateSnapshot CR can be created on the current Kubernetes server. This is intended to be temporary, to ensure a smooth upgrade process.
- Refactored metric- and status-related classes to be able to handle the new CR
Verifying this change
- Added unit-tests for new features
- Manual testing
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., any changes to the CustomResourceDescriptors: yes
- Core observer or reconciler logic that is regularly executed: yes
Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? not documented, there will be a separate PR for that before this one gets merged
Other implementation details
- In case of periodic snapshots, the Operator will create new FlinkStateSnapshot CRs, and each snapshot will be taken when its resource is reconciled. Their labels are not final yet.
- In case of upgrade snapshots, the Operator will create a new FlinkStateSnapshot CR, marking it with alreadyExists.
- Manual snapshots won't work with savepointTriggerNonce with the new CR; the user is expected to create FlinkStateSnapshot CRs themselves.
- Two new configurations were also added that were not specified in the FLIP:
  - periodic.savepoint.dispose-on-delete
  - job.upgrade.savepoint.dispose-on-delete
- Other metrics and configurable max history age/count will be implemented in FLINK-35492 and FLINK-35493 respectively.
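
For reviewers, here is a rough sketch of how the two dispose-on-delete options listed above could be consulted depending on where a snapshot came from. It is not the code in this PR. The class, enum, and method names are hypothetical, the option keys are copied as written in this description (the full key prefix used by the Operator may differ), the `false` default is an assumption, and reading the flag for manual snapshots from the user's own CR is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the PR.
class SnapshotDisposeSketch {

    /** Where a snapshot came from, following the three cases in the list above. */
    enum Origin { PERIODIC, UPGRADE, MANUAL }

    // Keys as written in the PR description; the real keys may carry a prefix.
    static final String PERIODIC_KEY = "periodic.savepoint.dispose-on-delete";
    static final String UPGRADE_KEY = "job.upgrade.savepoint.dispose-on-delete";

    /**
     * Decides whether snapshot data should be disposed when the CR is deleted.
     * Operator-created snapshots read the matching option (assumed default: false);
     * manual snapshots use the value the user set on their own CR (assumption).
     */
    static boolean disposeOnDelete(Origin origin, Map<String, String> conf, boolean userSpecValue) {
        switch (origin) {
            case PERIODIC:
                return Boolean.parseBoolean(conf.getOrDefault(PERIODIC_KEY, "false"));
            case UPGRADE:
                return Boolean.parseBoolean(conf.getOrDefault(UPGRADE_KEY, "false"));
            default:
                return userSpecValue;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PERIODIC_KEY, "true");
        System.out.println(disposeOnDelete(Origin.PERIODIC, conf, false));
        System.out.println(disposeOnDelete(Origin.UPGRADE, conf, false));
    }
}
```

Keeping the two options separate lets periodic and upgrade snapshots follow different cleanup policies without affecting CRs that users create by hand.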