Gather use cases for kubeadm operator from CAPI side

Open pacoxu opened this issue 3 years ago • 2 comments

User Story

kubeadm operator has been discussed many times before. Generally, it can handle

  • kubeadm cluster upgrade (some dry-run or prechecks)
  • kubeadm configuration changes
  • certs rotation

We want to gather a list of use cases from the CAPI side.

Detailed Description

I tried to build a kubeadm operator that can help users with day-2 operations. I also opened a thread at https://groups.google.com/g/kubernetes-sig-cluster-lifecycle/c/LMAABdj31DI.

Anything else you would like to add:

A description of the current kubeadm operator status:

I'm not sure whether this is the right place to discuss the kubeadm operator. There are some related threads in https://github.com/kubernetes/kubeadm/issues/2317 and kubernetes/enhancements#2505.

I wrote a simple kubelet-reloader as a tool for the kubeadm operator.

  • kubelet-reloader watches /usr/bin/kubelet-new.
  • Once a different version of kubelet-new appears, the reloader replaces /usr/bin/kubelet with it and restarts kubelet.
  • TODO: verify the kubelet configuration and version before replacing it.
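The reloader behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual kubelet-reloader code: the function names are hypothetical, "a different version" is approximated by comparing file contents, polling stands in for a real file watch, and kubelet is assumed to run as a systemd unit named `kubelet`.

```python
import hashlib
import os
import shutil
import subprocess
import time
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def restart_kubelet() -> None:
    # Assumption: kubelet is managed by systemd under the unit name "kubelet".
    subprocess.run(["systemctl", "restart", "kubelet"], check=True)


def reload_once(new: Path, current: Path,
                restart: Callable[[], None] = restart_kubelet) -> bool:
    """Replace `current` with `new` if their contents differ.

    Returns True if a replacement (and restart) happened.
    """
    if not new.exists():
        return False
    if current.exists() and file_digest(new) == file_digest(current):
        return False
    # Copy next to the target and rename, so the swap itself is atomic.
    tmp = current.with_name(current.name + ".tmp")
    shutil.copy2(new, tmp)
    os.chmod(tmp, 0o755)
    os.replace(tmp, current)
    restart()
    return True


def watch(new: Path, current: Path, interval: float = 5.0) -> None:
    """Poll forever; a real reloader might use inotify instead."""
    while True:
        reload_once(new, current)
        time.sleep(interval)
```

On a node this would be started as `watch(Path("/usr/bin/kubelet-new"), Path("/usr/bin/kubelet"))`. The TODO about verifying the kubelet configuration and version would fit in `reload_once` just before the copy.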

Currently, kubeadm-operator v0.1.0 supports upgrades across versions, e.g. from v1.22 to v1.24.

  • The kubeadm operator downloads kubectl/kubelet/kubeadm and performs the upgrade. (The current logic downloads the binaries directly; I am not sure whether yum upgrade/apt upgrade would be better.)
  • kubelet is placed in /usr/bin/kubelet-new for the kubelet reloader.
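Since kubeadm's documented upgrade procedure does not support skipping minor versions, an upgrade such as v1.22 to v1.24 presumably has to pass through v1.23. The thread does not say how the operator plans this, so the sketch below is an assumption; `parse_minor` and `upgrade_path` are hypothetical names used only for illustration.

```python
import re
from typing import List, Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Parse 'v1.22', '1.22' or 'v1.22.3' into (major, minor)."""
    m = re.fullmatch(r"v?(\d+)\.(\d+)(?:\.\d+)?", version)
    if m is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(m.group(1)), int(m.group(2))


def upgrade_path(current: str, target: str) -> List[str]:
    """List the minor versions to step through, one at a time."""
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    if cur_major != tgt_major:
        raise ValueError("major version changes are not handled in this sketch")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    # Every intermediate minor version is visited; none is skipped.
    return [f"v{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]
```

For example, `upgrade_path("v1.22", "v1.24")` returns `["v1.23", "v1.24"]`. Choosing the patch release for each step is left out.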

See quick-start.

Some thoughts on the next steps

My version, https://github.com/pacoxu/kubeadm-operator, is based on Fabrizio's first implementation, https://github.com/kubernetes/kubeadm/pull/2342, which follows the KEP https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2505-Kubeadm-operator. BTW, https://github.com/chendave/kubeadm-operator is a similar project to mine.

I hope to receive your feedback, suggestions, and requirements for the kubeadm operator.

/kind feature

pacoxu avatar Aug 10 '22 09:08 pacoxu

@pacoxu: This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 10 '22 09:08 k8s-ci-robot

/cc @chrischdi seems like something you might be interested in! 🙂

killianmuldoon avatar Aug 10 '22 10:08 killianmuldoon

cc @fabriziopandini

sbueringer avatar Aug 18 '22 10:08 sbueringer

@pacoxu thanks for reaching out with this issue. As you probably know, Cluster API promotes the idea of "immutable" infrastructure, so every mutation happens by creating a new Machine and deleting the old one.

There is some early discussion about mutability in CAPI (cc @enxebre), and personally, I would like to be more involved in it and work toward a high-level design doc; IMO there are a few areas this document should figure out:

  • The API, modeling the current and desired state of components and providing the UX for mutable changes
  • The operating model, which will most probably be a mix of mutability and immutability
  • The scope and boundaries of mutability in CAPI, e.g. whether OS or kernel upgrades are in scope for mutability
  • The split of responsibilities between different layers of the stack: what is in CAPI core, what in providers, what else?

When we get to the last point, this is probably where we will start to figure out whether something like the kubeadm operator can be relevant for CAPI, and how/which use cases it can cover. In the meantime, I will try to join the kubeadm office hours to follow the discussion there and to brainstorm about this idea.

fabriziopandini avatar Aug 28 '22 14:08 fabriziopandini

@fabriziopandini I just watched the kubeadm office hours recording. https://mail.google.com/mail/u/0/#inbox/FMfcgzGqPpgGlrWbcSVJpPnNjmbXXcQf is updated.

Thanks for your advice and comments on it. More feedback and discussion are needed. I need to think again about a better design and its limitations.

pacoxu avatar Sep 01 '22 15:09 pacoxu

This operator provides something similar to rpm-ostree with --apply-live. rpm-ostree focuses on immutability, and --apply-live adds some convenience. It is useful but cannot cover all cases: there are still situations where components such as runc/containerd need to be replaced.

Sorry, I'm new to this project. I'm just writing some comments based on my user experience.

Thanks @pacoxu

shawn111 avatar Sep 07 '22 12:09 shawn111

For the kubeadm-operator topic, I think runc/containerd is out of scope.

Package management with apt/yum may be part of it for the kubelet/kubeadm/kubectl upgrade. I think there are several choices.

  1. Download the binary and replace it directly. Simple, but does not cover all cases.
  2. Upgrade using yum/apt with a configured repo.
  3. Use rpm for offline installs (or use a local repo with RPMs, as in choice 2).
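To compare the three choices concretely, here is a rough sketch that maps each one to the command an operator might run on a node. It is not taken from kubeadm-operator: `install_command` is a hypothetical helper, the `-new` suffix for every component is an assumption extended from the kubelet case, and the package version string must match whatever the configured apt/yum repo publishes.

```python
from typing import List, Optional

COMPONENTS = ("kubelet", "kubeadm", "kubectl")


def install_command(method: str, component: str, version: str,
                    arch: str = "amd64",
                    rpm_path: Optional[str] = None) -> List[str]:
    """Build the argv for one install method (nothing is executed here)."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    if method == "binary":
        # Choice 1: fetch the release binary directly from dl.k8s.io.
        url = f"https://dl.k8s.io/release/v{version}/bin/linux/{arch}/{component}"
        return ["curl", "-fsSL", "-o", f"/usr/bin/{component}-new", url]
    if method == "apt":
        # Choice 2 (Debian/Ubuntu): pin the package to an exact version.
        return ["apt-get", "install", "-y", f"{component}={version}"]
    if method == "yum":
        # Choice 2 (RHEL/CentOS): name-version form.
        return ["yum", "install", "-y", f"{component}-{version}"]
    if method == "rpm":
        # Choice 3: offline install from a local RPM file.
        if rpm_path is None:
            raise ValueError("rpm_path is required for the rpm method")
        return ["rpm", "-Uvh", rpm_path]
    raise ValueError(f"unknown method: {method!r}")
```

Returning an argv list instead of running the command keeps the choice of method separate from execution, which makes a dry-run or precheck step easy to add.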

pacoxu avatar Sep 15 '22 03:09 pacoxu

@pacoxu If you consider managing binaries beyond kubelet/kubeadm/kubectl, do you think luet (https://luet.io/) would be a good solution? Luet is a container-based package manager whose packages are stored in a container registry. But that might be outside the CAPI design.

shawn111 avatar Sep 15 '22 14:09 shawn111

Let's keep this discussion open to channel feedback to @pacoxu

/triage accepted

fabriziopandini avatar Nov 30 '22 20:11 fabriziopandini

@pacoxu @fabriziopandini hi folks!

Found this while exploring the options for automatic CA rotation in CAPI (issue: https://github.com/kubernetes-sigs/cluster-api/issues/7721), and this looks to be spot on! I am mostly interested in the kubeadm operator and its usage in the context of a bare-metal provider implementation of CAPI. Copying the use cases from issue 7721 here for better reach:

There are also some cases in which the CA of the target clusters might be
different from that of the management cluster. Some use cases:

1. Deploy of management cluster and many target clusters with the same CA. 
Perform the cluster CA rotation on the target clusters and the management clusters without impact on traffic
2. Deploy of management cluster and many target clusters with different CA. 
Perform the cluster CA rotation on the target clusters and the management clusters without impact on traffic

Also @pacoxu, I have not got to the bottom of the initial discussion yet, but does https://github.com/pacoxu/kubeadm-operator implement https://github.com/kubernetes/kubeadm/issues/1698, and is it a prototype we can already try?

furkatgofurov7 avatar Dec 12 '22 12:12 furkatgofurov7

(doing some cleanup on old issues without updates)

/close

This work belongs to the kubeadm repo; if people have more use cases, they can report them there.

fabriziopandini avatar Mar 24 '23 19:03 fabriziopandini

@fabriziopandini: Closing this issue.

In response to this:

(doing some cleanup on old issues without updates)

/close

This work belongs to the kubeadm repo; if people have more use cases, they can report them there.

k8s-ci-robot avatar Mar 24 '23 19:03 k8s-ci-robot