compliantkubernetes-kubespray
Can immutable Nodes reduce patching and upgrades burden on the Kubespray layer?
What should be investigated?
Our current patching and upgrade strategy can be summarized as follows:
- unattended-upgrades or Ansible updates the software on the Nodes.
- Kured or administrator action reboots the Nodes.
This works, but the burden is starting to pile up.
An alternative approach is to replace Nodes:
- We build VM images.
- We copy that image to all projects/regions/providers.
- We replace Nodes by simply pointing Terraform variables to the new VM image.
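
For illustration, here is a minimal Terraform sketch of what "pointing variables to the new VM image" could look like. The variable and resource names are made up, and I am assuming the OpenStack provider; this is not our actual configuration:

```hcl
# Hypothetical sketch: Node replacement driven by a single image variable.

variable "node_image" {
  description = "Name of the pre-built VM image that Nodes boot from"
  type        = string
  default     = "ck8s-ubuntu-22.04-v1" # bump this to roll all Nodes
}

resource "openstack_compute_instance_v2" "worker" {
  count       = 3
  name        = "worker-${count.index}"
  image_name  = var.node_image # changing the image forces Node replacement
  flavor_name = "m1.large"

  lifecycle {
    create_before_destroy = true # boot the replacement before destroying the old Node
  }
}
```

With something like this, an upgrade becomes a `terraform apply` after bumping `node_image`, instead of patching Nodes in place.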
What artifacts should this produce?
I'm unsure what the best artifact would be, but I'd like answers to the following questions:
- How easy is it to reuse our current Kubespray-based processes to build VM images?
- How easy is it to configure a built VM image to make it a control plane and/or data plane Node? Specifically, how easy is it to make a new Node join an existing Kubernetes cluster?
- Would we need to copy the built VM image to each project? Or would one per region and cloud provider suffice?
- Would this actually reduce burden? What about risk?
- What would the life of the DevOps team look like if we implemented Node replacement?
- What would we need to implement to move in this direction?
- A recommendation on whether we should move ahead or not.
So back to ck8s-cluster and ck8s-base-vm 🤔 😄
Is there any flag/option for immutable support in Kubespray?
@OlleLarsson Some of the ideas in ck8s-cluster and ck8s-base-vm are definitely worth bringing back. However, let us stay closer to upstream this time and put ourselves in a better position for auto-scaling, e.g., this project.
I think Kubespray's offline environments feature is the closest to our use case, but I haven't evaluated whether it ticks all the boxes needed for a base VM image.
If all software is already pre-installed, but not pre-configured, how long does a run of Kubespray take?
So, assuming we had an Ubuntu image that happens to have all the Debian packages installed, would that speed things up considerably, or at least noticeably?
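
To make the question concrete, a rough sketch of how such an image could be baked, assuming Packer's OpenStack builder with the Ansible provisioner. The image names, flavor, and tags are illustrative assumptions, not a verified Kubespray invocation:

```hcl
# Hypothetical sketch: baking a base VM image with software pre-installed,
# reusing Kubespray's own playbooks via Packer's Ansible provisioner.

source "openstack" "base" {
  image_name        = "ck8s-base-ubuntu-22.04" # resulting image (assumed name)
  source_image_name = "ubuntu-22.04"
  flavor            = "m1.medium"
  ssh_username      = "ubuntu"
}

build {
  sources = ["source.openstack.base"]

  provisioner "ansible" {
    playbook_file = "kubespray/cluster.yml"
    # Assumption: restricting the run to download/install tasks so nothing
    # cluster-specific (certificates, join tokens) gets baked into the image.
    extra_arguments = ["--tags", "download"]
  }
}
```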
@llarsson Just to write up my thoughts: at this point I'm unsure whether Kubespray is going to be the future. Hence, your question "how long does a run of Kubespray take" is good, but might become deprecated soon:tm:. :smile:
Specifically, @Xartos is currently looking into Cluster API with the OpenStack provider as part of https://github.com/elastisys/ck8s-ops/issues/1898#issuecomment-1208195016. That seems to come with its own Ansible roles to build images. I am unsure whether the Cluster API Ansible roles compete with or complement Kubespray. It could be that we'll use Kubespray to set up the non-auto-scaled part of the cluster, i.e., the control plane plus the Cluster API provider, while the Cluster API Ansible scripts will be used for the auto-scaled part of the cluster.
I have always thought (though I don't know where these thoughts fall on the spectrum of correctness) that we do quite a bit of configuration within our Kubespray settings, settings that could be hard to replicate conveniently with Cluster API.
Essentially, CAPI has certain settings and intentions about the particular cookie-cutter type of cluster it should generate, while we have other intentions behind our (security-related) settings.
I look forward to @Xartos' investigation into all of this, because if we can use the Cluster API and therefore the Cluster Autoscaler, I would be very happy! :smiley:
Closing, as we will go with Cluster API.