
Tracking issue for changing the cluster (reconf)

Open neolit123 opened this issue 6 years ago • 17 comments

This is the tracking issue for "change the cluster": the feature request is to provide an easy-to-use UX for users who want to change properties of a running cluster.

existing proposal docs: TODO

kubeadm operator: https://github.com/kubernetes/kubeadm/issues/1698

User story: https://github.com/kubernetes/kubeadm/issues/1581

neolit123 avatar Jul 04 '18 14:07 neolit123

In order to address this issue, IMO kubeadm should clearly split updates (changes to the cluster configuration) from upgrades (changes of release) by removing any option to change the cluster config during upgrades and creating a separate new kubeadm update/apply action.

Main rationale behind this opinion

  • the complexity of the upgrade workflow
  • the size of the test matrix for all supported permutations of cluster type / release change / possible changes to the cluster configuration
  • the current test infrastructure

fabriziopandini avatar Jul 04 '18 15:07 fabriziopandini

/kind feature

too late for 1.12, can be addressed in 1.13.

neolit123 avatar Sep 17 '18 21:09 neolit123

@neolit123 I have a KEP in flight for this

fabriziopandini avatar Sep 18 '18 19:09 fabriziopandini

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot avatar Apr 07 '19 17:04 fejta-bot

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot avatar May 07 '19 17:05 fejta-bot

Do we still need this?

ezzoueidi avatar May 24 '19 00:05 ezzoueidi

i've removed the help-wanted label here. this comment by @fabriziopandini still applies: https://github.com/kubernetes/kubeadm/issues/970#issuecomment-402507153

the way --config for apply works might have to be changed.

this overlaps with the Kustomize ideas: https://github.com/kubernetes/kubeadm/issues/1379

/lifecycle frozen

neolit123 avatar May 25 '19 14:05 neolit123

@fabriziopandini can this be renamed as the ticket for "change the cluster"? also the link to the KEP has changed.

neolit123 avatar Aug 03 '19 00:08 neolit123

@neolit123 IMO there are several tickets tracking slightly different variations of the same topic, which is changing the configuration of a running cluster. My proposal is to close this and consolidate everything under #1698

fabriziopandini avatar Aug 05 '19 09:08 fabriziopandini

What's the current status of this?

I think it's common to update some configs after kubeadm init, but I couldn't find any doc that addresses this properly yet.

I personally expect something like:

kubeadm config update [flags]

or

kubeadm update [flags]

So that we can update any component's config, especially ApiServer and ControllerManager, in a streamlined way. Thank you!

brightzheng100 avatar Oct 09 '19 04:10 brightzheng100

@brightzheng100 see https://github.com/kubernetes/kubeadm/issues/1698

fabriziopandini avatar Oct 09 '19 07:10 fabriziopandini

That solution looks complicated, but please do proceed to make it a complete one!

Anyway, I tried it out by simply using the kubeadm upgrade apply command and it worked after some failures and experiments.

A detailed case can be found here -- hope it helps.

brightzheng100 avatar Oct 09 '19 08:10 brightzheng100

Anyway, I tried it out by simply using the kubeadm upgrade apply command and it worked after some failures and experiments.

it's really not recommended and i'm trying to deprecate it because of the "failures" part that you mention. also it's really not suited for reconfiguring multi-control-plane setups.

the existing workaround to modifying the cluster is:

  • modify the kubeadm-config ConfigMap with your new values.
  • modify the coredns and kube-proxy ConfigMap to match the kubeadm-config changes if needed.
  • go to each node and modify your /etc/kubernetes/manifests files.

with a proper SSH setup this is not a complicated bash script, but it's still not the best UX for new users.
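For illustration, a rough sketch of that workaround as a script, assuming passwordless SSH/sudo to hypothetical control-plane nodes cp-1, cp-2 and cp-3 and an interactive edit of the kube-apiserver manifest:

#!/usr/bin/env bash
# Rough sketch of the manual reconfiguration workaround above.
# Node names and the edited manifest are placeholders, not a fixed recipe.
set -euo pipefail

NODES=("cp-1" "cp-2" "cp-3")   # hypothetical control-plane node names

# 1. Update the cluster-wide configuration stored by kubeadm.
kubectl -n kube-system edit configmap kubeadm-config

# 2. If the change affects them, update the addon ConfigMaps as well.
kubectl -n kube-system edit configmap kube-proxy
kubectl -n kube-system edit configmap coredns

# 3. Apply the matching change to the static Pod manifests on every node;
#    the kubelet restarts a control-plane Pod when its manifest changes.
for node in "${NODES[@]}"; do
  ssh -t "$node" "sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml"
done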

neolit123 avatar Oct 09 '19 11:10 neolit123

@brightzheng100 thanks for your feedback. UX is a major concern, and this is why we are prototyping around this proposal.

fabriziopandini avatar Oct 09 '19 11:10 fabriziopandini

the existing workaround to modifying the cluster is:

  • modify the kubeadm-config ConfigMap with your new values.

This is not required based on my experiments: once we drive things with a kubeadm-config.yaml, the kubeadm-config ConfigMap is updated accordingly.

  • modify the coredns and kube-proxy ConfigMap to match the kubeadm-config changes if needed.

I haven't found any reason yet to update these ConfigMaps manually if we just want to enable/disable some features in kube-apiserver. But I did find that sometimes the coredns pods would end up in CrashLoopBackOff.

  • go to each node and modify your /etc/kubernetes/manifests files.

Currently I have a single-master env so I haven't tried it out yet, but yes, I think we have to sync up these static pods' manifests.

Frankly, building a kubeadm operator sounds like a bit of overkill from my perspective (and of course I may be wrong). Again, I'm expecting a simple command, like this:

kubeadm config update [flags]

brightzheng100 avatar Oct 09 '19 14:10 brightzheng100

This is not required based on my experiments: once we drive things with a kubeadm-config.yaml, the kubeadm-config ConfigMap is updated accordingly.

joining new control-planes to the cluster would still need an updated version of ClusterConfiguration.
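For reference, the ClusterConfiguration that a joining control plane reads is stored in the kubeadm-config ConfigMap in kube-system, so it can be inspected or edited directly, for example:

# view the ClusterConfiguration that future control-plane joins will use
kubectl -n kube-system get configmap kubeadm-config -o yaml

# or edit it in place so the next join picks up the change
kubectl -n kube-system edit configmap kubeadm-config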

I haven't found any reason yet to update these ConfigMaps manually if we just want to enable/disable some features in kube-apiserver. But I did find that sometimes the coredns pods would end up in CrashLoopBackOff.

sadly, there are many reasons for the coredns pods to enter a crashloop; the best way is to look in the logs. if nothing works, removing the deployment and re-applying a CNI plugin should fix it.
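A few diagnostic commands for that case (a sketch only; kubeadm labels the CoreDNS Pods with k8s-app=kube-dns):

# inspect the CoreDNS pods, their events and their logs
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system describe pods -l k8s-app=kube-dns
kubectl -n kube-system logs -l k8s-app=kube-dns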

Currently I have a single-master env so I haven't tried it out yet, but yes, I think we have to sync up these static pods' manifests.

that is why kubeadm upgrade apply --config is not a good workaround for multi-control plane scenarios.

Frankly, building a kubeadm operator sounds like a bit of overkill from my perspective (and of course I may be wrong).

i agree, for patching CP manifests on a single-CP cluster you are better off just applying the manual steps instead of the operator.

kubeadm config update [flags]

a similar approach was discussed, where we execute "a command" on all nodes to apply an upgrade / re-config, but we went for the operator instead because that's a common pattern in k8s.

neolit123 avatar Oct 09 '19 14:10 neolit123

For me (on Kubernetes 1.17.11) the following worked.

  1. Edit the ConfigMap with my changes:
kubectl edit configmap -n kube-system kubeadm-config
  2. On all nodes run:
sudo kubeadm upgrade apply <version>

where <version> is the current kubeadm version (in my case 1.17.11).

Additional information

  • All of my 3 nodes are control plane nodes.
  • I only changed data.ClusterConfiguration.apiServer.extraArgs (sketched below).
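For illustration, those two steps spelled out as commands; the flag shown in the comment is only an example, not part of the original report:

# step 1: edit the stored ClusterConfiguration; the change goes under
# data.ClusterConfiguration -> apiServer.extraArgs, e.g. (example only):
#   apiServer:
#     extraArgs:
#       audit-log-maxage: "30"
kubectl edit configmap -n kube-system kubeadm-config

# step 2: on every control-plane node, re-render the static Pod manifests
# with the version the cluster is already running
sudo kubeadm upgrade apply v1.17.11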

haslersn avatar Sep 15 '20 15:09 haslersn