
Support for running a fully self-managed Kubernetes control-plane on Bottlerocket OS (AWS variant)

caiolombello opened this issue 7 months ago • 2 comments

Hello Bottlerocket maintainers,

I’m exploring a 100% self-managed Kubernetes setup on AWS using Bottlerocket OS for both control-plane and worker nodes. My requirements:

  • Deploy EC2 instances with the Bottlerocket AWS variant (aws-k8s-* AMI).
  • Run control-plane components (kube-apiserver, etcd, controller-manager, scheduler) directly on Bottlerocket hosts as static pods or containers.
  • Connect standard Bottlerocket worker nodes to this self-hosted control-plane.

What I’ve discovered so far:

  • The AWS-optimized Bottlerocket AMI only includes kubelet and kube-proxy for worker nodes.
  • The “metal-k8s-*” variant supports standalone mode + static pods but requires bare-metal instance types.

Questions:

  1. Is there a supported way to run control-plane components on the AWS Bottlerocket AMI (aws-k8s-*) on standard EC2 instance types?
  2. If not, are there plans to extend the AWS variant to include standalone-mode or static-pod support for the control plane?
  3. Are there recommended workarounds to achieve a self-hosted control-plane without bare-metal instance types?

Additional context from my attempts:

  1. I attempted this with the aws-k8s-1.32 variant, using Terraform for infrastructure provisioning, kubeadm init in a host container, and Bottlerocket settings for standalone mode plus the control/admin host containers.
  2. Encountered issues:
    • The k8s-bootstrap container starts, but the /etc/kubernetes directory never appears.
    • The control plane never initializes despite standalone-mode = true.
    • The kubelet service runs but waits indefinitely for configuration.
    • Secrets Manager shows the cluster info stuck in “pending.”
  3. Current settings in user-data TOML:
[settings.kubernetes]
api-server = "https://${hostname}:6443"
cluster-name = "${cluster_name}"
cluster-domain = "cluster.local"
authentication-mode = "tls"
standalone-mode = true
cluster-certificate = "${cluster_ca_cert}"
image-gc-high-threshold-percent = 85
image-gc-low-threshold-percent = 80

[settings.kubernetes.eviction-hard]
"memory.available" = "15%"
"nodefs.available" = "10%"
"nodefs.inodesFree" = "5%"

[settings.kubernetes.node-taints]
dedicated = ["experimental:PreferNoSchedule", "experimental:NoExecute"]
special = ["true:NoSchedule"]

[settings.network]
hostname = "${hostname}"

[settings.host-containers.control]
enabled = true
superpowered = true

[settings.host-containers.k8s-bootstrap]
enabled = true
superpowered = true
source = "${setup_image}"
user-data = "${bootstrap_userdata}"

[settings.ntp]
time-servers = ["169.254.169.123", "time.aws.com"]

[settings.kernel]
lockdown = "integrity"

[settings.kernel.sysctl]
"net.ipv4.conf.all.forwarding" = "1"
"net.ipv4.ip_forward" = "1"
"net.bridge.bridge-nf-call-iptables" = "1"
"net.ipv4.conf.all.send_redirects" = "0"
"net.ipv4.conf.all.accept_redirects" = "0"

[settings.container-runtime]
max-container-log-line-size = 20971520
enable-unprivileged-icmp = true
enable-unprivileged-ports = true

  4. Repository with my Terraform and bootstrap code: https://github.com/caiolombello/KubeRocket

Is this expected behavior for the aws-k8s variant? Are there any undocumented requirements or configuration flags needed to enable control-plane static pods on AWS Bottlerocket?

Thank you for any guidance or pointers!

caiolombello commented May 10 '25 01:05

Hi @caiolombello! We are running a control plane on Bottlerocket bare metal, but this is not for the faint of heart, as bootstrapping the components requires you to produce and secure several certificate authorities. Projects like kubeadm do this for you, but in a way that I would not recommend for any security-sensitive setup.

There isn't really a project that solves the security problem of standing up a Kubernetes control plane in a turnkey way (as far as I am aware).

Standalone mode allows the kubelet to start without a control plane, so to get your design working, I would recommend disabling standalone mode and relying on the kubelet's retries to connect to the control plane once it becomes available (see the sketch below).
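As a minimal sketch (the endpoint and placeholder values here are mine, not from the linked repo), the worker user-data would then point straight at the control-plane endpoint, and standalone-mode can simply be left at its default of false:

[settings.kubernetes]
# Point at the control-plane endpoint instead of enabling standalone mode;
# the kubelet keeps retrying until the API server answers.
api-server = "https://<control-plane-endpoint>:6443"
cluster-name = "<cluster_name>"
cluster-certificate = "<base64-encoded-cluster-ca>"
authentication-mode = "tls"
standalone-mode = false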

The last issue with your current idea and setup is that you need the IP of the control plane before you initialize the nodes, and you cannot configure localhost, as that would make it a single-node cluster. Since you want this to work on AWS, I guess you could stand up an ALB in front of the control plane, and maybe grab its address in Terraform before you create the nodes and provision the configuration for them?
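Roughly, with a hypothetical Terraform template variable for the load balancer's DNS name (the variable name is illustrative, not from the issue's repo):

[settings.kubernetes]
# ${control_plane_lb_dns} would be interpolated by Terraform from the
# load balancer created before the worker instances exist.
api-server = "https://${control_plane_lb_dns}:6443"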

The big caveat, I would say, is that setting up a control plane this way becomes orthogonal to the security posture that Bottlerocket generally allows you to keep, as you create a relatively large vulnerability in your setup by bootstrapping the cluster with a certificate that lives inside the cluster. This means that any vulnerability in the Kubernetes components could potentially allow material exfiltration and a complete takeover of the entire cluster.

mikn commented May 27 '25 09:05

@caiolombello minor correction to this:

The “metal-k8s-*” variant supports standalone mode + static pods but requires bare-metal instance types.

Static pods and standalone mode are supported on all the k8s variants (*-k8s-*). Also, confusingly, aws-k8s-* works on EC2 bare metal instances, while metal-k8s-* (now end-of-life) was for non-cloud, provider-agnostic deployments. A sketch of the static-pod settings follows.
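For illustration, a static pod can be declared directly in user-data on any k8s variant through the static-pods settings; this is a minimal sketch with a placeholder pod name and manifest value (the manifest must be a base64-encoded pod spec):

[settings.kubernetes.static-pods.my-apiserver]
enabled = true
# Base64-encoded Kubernetes pod manifest (placeholder value).
manifest = "<base64-encoded-pod-manifest>"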

You might find the bottlerocket-bootstrap tool from EKS Anywhere interesting. It implements the node-side logic for bootstrapping Bottlerocket nodes into a cluster, starting from standalone mode and static pods. It's meant for use with Cluster API but could possibly be adapted to your preferred orchestration tool.

bcressey commented Jun 03 '25 15:06