cluster-api

Supporting an Inplace Update Rollout Strategy for upgrading Workload Clusters

Open dharmjit opened this issue 1 year ago • 5 comments

User Story

As a Platform Operator managing Kubernetes clusters in resource-constrained (non-HA) environments and/or specialized, customized environments, I want to upgrade Kubernetes clusters without rolling out new nodes.

Detailed Description

For use cases such as single-node clusters with no spare capacity, or multi-node clusters that rely on VM/OS customizations for high-performance/low-latency workloads or that depend on local persistent storage, upgrading a workload cluster via the RollingUpdate rollout strategy may be infeasible, or may be costly because the customizations have to be re-applied on the new nodes, which means more downtime.

CAPI follows and promotes immutable infrastructure principles for a range of advantages. With the emergence of image-based OS upgrade techniques, such as A/B partition upgrades or OSTree filesystem upgrades, which preserve immutable OS characteristics, we could rethink whether CAPI should provide another rollout strategy for updating Kubernetes and the OS of workload clusters.

At a high level, the requirements could include:

  • To introduce a new rollout strategy that allows upgrading workload clusters without rolling out new nodes.
  • To support this new rollout strategy for both ClusterClass and non-ClusterClass clusters.
  • To support this new rollout strategy for both the control plane and the worker nodes of a workload cluster.
  • To ensure this new rollout strategy is agnostic of the underlying image-based OS upgrade implementation (OSTree upgrades, A/B partition upgrades, etc.).

Note: For highly available clusters in resource-constrained environments, CAPI already provides strategies like ScaleIn (KCP) and OnDelete (MD) for upgrades that do not require additional infrastructure capacity.
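
To make the gap concrete, here is a rough sketch of which strategy would fit which situation described above. It is only an illustration of the reasoning in this issue, not CAPI code: `NodeGroup`, `suggest_strategy`, and the `InPlace` value are hypothetical names, and the decision rules are my reading of the use cases, not existing controller behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    ROLLING_UPDATE = "RollingUpdate"  # existing: rolls out new nodes
    SCALE_IN = "ScaleIn"              # existing KCP option named in the note
    ON_DELETE = "OnDelete"            # existing MD option named in the note
    IN_PLACE = "InPlace"              # hypothetical: the strategy proposed here


@dataclass
class NodeGroup:
    role: str                  # "control-plane" or "worker"
    replicas: int
    spare_capacity: bool       # can the infrastructure host an extra node?
    node_customizations: bool  # VM/OS tuning or local persistent storage


def suggest_strategy(group: NodeGroup) -> Strategy:
    """Map the situations from this issue to a rollout strategy (illustrative)."""
    # Replacing a node loses customizations or local data, and a single node
    # without spare capacity cannot be replaced at all.
    if group.node_customizations or (group.replicas == 1 and not group.spare_capacity):
        return Strategy.IN_PLACE
    if group.spare_capacity:
        return Strategy.ROLLING_UPDATE
    # Highly available but no extra capacity: the options from the note apply.
    if group.role == "control-plane":
        return Strategy.SCALE_IN
    return Strategy.ON_DELETE
```

For example, `suggest_strategy(NodeGroup("control-plane", 1, False, False))` falls through to the proposed in-place strategy, which is the single-node case that none of the existing options cover.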

Anything else you would like to add?

There are already some CAPI Slack discussions and GitHub issues about in-place upgrade needs, and folks probably already have ideas or additional use cases. It would be great to hear and discuss those in the comments, and it would probably be beneficial to create a Working Group around this feature.

Some GitHub issues around in-place upgrades/mutability in CAPI, with the folks who took part in those discussions tagged below:

  • #7415
  • #7044

cc: @furkatgofurov7 @pacoxu @fabriziopandini @sbueringer @shivi28

Please feel free to add more folks interested in this feature.

/kind feature
/area upgrades

dharmjit avatar Sep 25 '23 07:09 dharmjit