Bottlerocket OS support in kubeadm bootstrap

Open g-gaston opened this issue 2 years ago • 12 comments

User Story

As an operator I would like to manage clusters using Bottlerocket as the machines' OS for its security and reliability for container-based workloads.

Detailed Description

I would like to add support for Bottlerocket to the kubeadm bootstrap provider. Bottlerocket supports neither cloud-init nor Ignition, so this would require updating the API and controllers.

The most straightforward option would be to add a third configuration format. In fact, that's how we implemented it initially, and we have been operating it for quite some time.

However, due to Bottlerocket's special architecture, this implementation works quite differently internally from cloud-init (not sure about Ignition; I haven't looked deeply into its implementation). For example, kubelet is controlled by the OS, and commands are run not on the host machine but in a container. This requires including Bottlerocket-specific config (like the startup container image) and also means that some of the kubeadm configs, like disk setup or pre/post kubeadm commands, are not directly transferable to Bottlerocket.

Moreover, there are certain features supported by Bottlerocket that users might want to configure (like a container registry mirror) and that might not be transferable to other formats/OSs, which would require adding more Bottlerocket-specific config. This would pollute more the KubeadmConfig and further break the current abstraction (OS agnostic config that is transformed to different formats for the machines user-data).

All of this makes me think it could be a good idea to decouple the OS bootstrapping (config, format, etc.) from the kubeadm bootstrapping (kubeadm config generation, commands, etc.). That would allow us to add support for more OSs independently while reusing the core kubeadm provider's functionality. It seems that this idea has been proposed before, although I believe that issue adds more requirements, increasing the scope of the solution.
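To make the decoupling idea a bit more concrete, here is a minimal Go sketch. Everything in it is hypothetical: `BootstrapData`, `OSRenderer`, `RenderFor`, and the toy `shell` format are names invented for illustration and do not exist in Cluster API. The point is only the shape: the kubeadm side produces OS-agnostic data, and each OS format is a separate renderer that can be added without touching the kubeadm logic.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// BootstrapData is a hypothetical, OS-agnostic result of the kubeadm step.
// It is not an existing Cluster API type.
type BootstrapData struct {
	Files   map[string]string // file path -> file content
	Command string            // e.g. a kubeadm init/join invocation
}

// OSRenderer is a hypothetical interface: one implementation per OS format
// (cloud-init, Ignition, Bottlerocket, ...).
type OSRenderer interface {
	Render(data BootstrapData) (string, error)
}

// shellRenderer is a toy format used only for illustration: it emits a
// shell script that writes the files and then runs the command.
type shellRenderer struct{}

func (shellRenderer) Render(d BootstrapData) (string, error) {
	if d.Command == "" {
		return "", fmt.Errorf("bootstrap data has no command")
	}
	// Sort the paths so the rendered output is deterministic.
	paths := make([]string, 0, len(d.Files))
	for p := range d.Files {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	var b strings.Builder
	b.WriteString("#!/bin/sh\n")
	for _, p := range paths {
		fmt.Fprintf(&b, "cat > %s <<'EOF'\n%s\nEOF\n", p, d.Files[p])
	}
	b.WriteString(d.Command + "\n")
	return b.String(), nil
}

// renderers maps a format name to its renderer. Supporting a new OS would
// mean registering another entry here, with no change to the kubeadm side.
var renderers = map[string]OSRenderer{
	"shell": shellRenderer{},
}

// RenderFor picks the renderer for the requested format.
func RenderFor(format string, d BootstrapData) (string, error) {
	r, ok := renderers[format]
	if !ok {
		return "", fmt.Errorf("unsupported format %q", format)
	}
	return r.Render(d)
}

func main() {
	data := BootstrapData{
		Files:   map[string]string{"/etc/example.conf": "key: value"},
		Command: "kubeadm join --config /etc/example.conf",
	}
	out, err := RenderFor("shell", data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

In this shape, OS-specific options (such as a Bottlerocket startup container image) would live with the renderer's own config rather than in KubeadmConfig.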

Looking for the community's feedback:

  • Would it be interesting to support Bottlerocket?
  • How should we proceed?
    • Are we ok to continue further with the pattern followed by cloud-init and ignition?
    • Or should we pursue a possible OS bootstrapping decoupling before adding Bottlerocket support?

/kind feature

g-gaston avatar Jan 03 '23 20:01 g-gaston

/assign

I work on Bottlerocket, and was hoping to get to this sometime soon. So I'm going to assign this to myself for now and start poking around.

I don't have a definite timeframe that I can plan on getting something working, so if anyone else comes along and really, really wants to dive in, just let me know! Otherwise I'll try to get a plan together soon and see what it will take. 😁

stmcginnis avatar Jan 04 '23 14:01 stmcginnis

@stmcginnis I was offering myself to do this work since we already have a working implementation, but if you have bandwidth, that's awesome! :)

g-gaston avatar Jan 04 '23 16:01 g-gaston

/unassign
/assign @g-gaston

Awesome, thanks @g-gaston. I don't currently have the bandwidth, so I'll unassign. I'll watch for any changes and help review though. Just let me know if I can help in any way.

stmcginnis avatar Jan 04 '23 16:01 stmcginnis

  • Would it be interesting to support Bottlerocket?
  • How should we proceed?
    • Are we ok to continue further with the pattern followed by cloud-init and ignition?
    • Or should we pursue a possible OS bootstrapping decoupling before adding Bottlerocket support?

I personally don't have any requirements on Bottlerocket so can't speak to the first point but I do want to comment on the second.

As you point out,

This would pollute more the KubeadmConfig and further break the current abstraction (OS agnostic config that is transformed to different formats for the machines user-data).

We had many discussions during the implementation of Ignition to support Flatcar. The same argument of "the issue adds more requirements, increasing the scope of the solution" was put forth as a reason to add Ignition into CABPK as a format option as a short-term solution. The consensus was that we as a community would do the work to more cleanly separate OS requirements from k8s bootstrappers as a long-term solution. I can see why it might be tempting to look at the current status quo and replicate it. It might seem like less work to extend KubeadmConfig too. However, I believe investing in this cleaner separation of concerns would greatly reduce maintenance costs in the long term. Even in the short term, I don't think adding Bottlerocket to the current bootstrapper will be a quick and easy solution. For consideration, the PR to add Ignition (https://github.com/kubernetes-sigs/cluster-api/pull/4172) took 10 months to merge (there might have been other factors involved, but from my recollection, the added technical debt in CABPK was the reviewers' main concern).

My vote is for option 2.

/cc @vincepri @fabriziopandini @sbueringer @enxebre

CecileRobertMichon avatar Jan 04 '23 21:01 CecileRobertMichon

Would it be interesting to support Bottlerocket?

I think it'd be good to explore a roadmap to materialise the decoupling in https://github.com/kubernetes-sigs/cluster-api/issues/5294 and hopefully enable an easier adoption path for any OS generally.

How should we proceed? Are we ok to continue further with the pattern followed by cloud-init and ignition? Or should we pursue a possible OS bootstrapping decoupling before adding Bottlerocket support?

I concur with @CecileRobertMichon. Last time we discussed this topic, I think there was general consensus that adding another OS-specific format to the current bootstrapper was a no-go, as it's neither sustainable nor scalable. I'd suggest option 2 as well, and we can use Bottlerocket as the use case to prove the solution is valid.

enxebre avatar Jan 05 '23 14:01 enxebre

I agree with all that Alberto and Cecile wrote. Let's explore decoupling. I know it's (probably) more effort, but the addition of Ignition really sent us down a path that we shouldn't go any further down.

sbueringer avatar Jan 05 '23 15:01 sbueringer

+1 on exploring decoupling, I don't think that bootstrapping mechanisms should leak into cabpk or bootstrappers in general

yastij avatar Jan 05 '23 16:01 yastij

/triage accepted

I agree with all that Alberto and Cecile wrote. Happy to help in defining the path forward

fabriziopandini avatar Jan 11 '23 11:01 fabriziopandini

Thank you everyone!

If everyone is ok with it, I'm more than happy to start the initiative for an OS bootstrapping decoupling proposal. I'll start by compiling a list of requirements/use cases, so I'll probably be reaching out to some of you shortly.

g-gaston avatar Jan 12 '23 18:01 g-gaston

@g-gaston , feel free to reach out on Slack if you want any background on the earlier proposal.

randomvariable avatar Feb 01 '23 10:02 randomvariable

Hello there, I work on Bottlerocket as well (Hi @stmcginnis!). I worked on the bootstrap containers implementation, and I just wanted to drop a few notes here:

  • Unlike Ignition, bootstrap containers are just that: containers. No special format is required to use them, and folks can use any container image to bootstrap a host, so the implementation might be simpler(?)
  • I don't know how "k8s bootstrappers" work, but if the OS bootstrapping operations change (say, commands are dynamically generated), it might make sense to use a bootstrap container that executes anything that is passed as its user data
  • A caveat of bootstrap containers: folks might need to maintain a "k8s bootstrapper" container with the one shell script that knows how to bootstrap the node, unless the idea proposed above is simpler to implement
  • Lastly, bootstrap containers run before the kubelet starts, so if "k8s bootstrappers" do anything that requires the kubelet to be running alongside them, bootstrap containers won't be able to fulfill that need.
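
As a rough illustration of the second point, here is a small Go sketch of how a provider could hand a dynamically generated script to a bootstrap container through Bottlerocket settings. The setting keys (`source`, `mode`, `essential`, `user-data` under `settings.bootstrap-containers.<name>`) follow the Bottlerocket settings as I understand them and should be checked against the current docs; the function name, container name, and image URI are made up for the example, and treating the user data as a shell script is an assumption about what the container image does with it.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// bootstrapContainerTOML renders a Bottlerocket user-data fragment that
// registers one bootstrap container. It is a hypothetical helper, not part
// of Cluster API or Bottlerocket.
//
// name must be a valid bare TOML key (letters, digits, '-' or '_').
// script is base64-encoded because the user-data setting expects base64.
func bootstrapContainerTOML(name, image, script string) string {
	userData := base64.StdEncoding.EncodeToString([]byte(script))
	return fmt.Sprintf(
		"[settings.bootstrap-containers.%s]\n"+
			"source = %q\n"+
			"mode = \"once\"\n"+ // run on first boot only
			"essential = true\n"+ // a failure should stop the boot
			"user-data = %q\n",
		name, image, userData)
}

func main() {
	// The image URI and script are placeholders for illustration.
	script := "#!/bin/sh\necho bootstrapping node\n"
	fmt.Print(bootstrapContainerTOML("kubeadm", "registry.example.com/k8s-bootstrap:v1", script))
}
```

With this approach the image stays generic (it only needs to run whatever it receives), and the per-node commands travel in the user data.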

I'm happy to answer any questions if anything I said is confusing, or if there are more unanswered questions down the road (unless @stmcginnis beats me!).

arnaldo2792 avatar Feb 08 '23 18:02 arnaldo2792

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Feb 08 '24 19:02 k8s-triage-robot

/priority backlog

fabriziopandini avatar Apr 11 '24 18:04 fabriziopandini

The Cluster API project currently lacks enough active contributors to adequately respond to all issues and PRs.

Also, this seems blocked by https://github.com/kubernetes-sigs/cluster-api/issues/5294, so let's solve this one first

/close

fabriziopandini avatar May 02 '24 12:05 fabriziopandini

@fabriziopandini: Closing this issue.

In response to this:

The Cluster API project currently lacks enough active contributors to adequately respond to all issues and PRs.

Also, this seems blocked by https://github.com/kubernetes-sigs/cluster-api/issues/5294, so let's solve this one first

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 02 '24 12:05 k8s-ci-robot