v0.8.25 has broken cluster config with containerdConfigPatches
The following cluster-config.yaml file works on v0.8.24 but fails in v0.8.25:
apiVersion: ctlptl.dev/v1alpha1
kind: Registry
name: ctlptl-registry
port: 5000
---
apiVersion: ctlptl.dev/v1alpha1
kind: Cluster
product: kind
registry: ctlptl-registry
name: kind-mycluster
kindV1Alpha4Cluster:
  containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."example.internal:5001"]
      endpoint = ["http://example.internal:5001"]
  nodes:
  - role: control-plane
    kubeadmConfigPatches:
    - |
      kind: InitConfiguration
      nodeRegistration:
        kubeletExtraArgs:
          node-labels: "ingress-ready=true"
    extraPortMappings:
    - containerPort: 5001
      hostPort: 5001
      protocol: TCP
      listenAddress: "127.0.0.1"
Note that I have appended "127.0.0.1 example.internal" to my /etc/hosts file.
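For reference, the added /etc/hosts line is just:

127.0.0.1 example.internal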
The complexity in the configuration is because I am hosting another registry within the kind cluster and exposing it on port 5001.
For a minimal failing example:
apiVersion: ctlptl.dev/v1alpha1
kind: Cluster
product: kind
name: kind-mycluster
kindV1Alpha4Cluster:
  containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."example.internal:5001"]
      endpoint = ["http://example.internal:5001"]
Running ctlptl apply -f cluster-config.yaml fails with the following output:
registry.ctlptl.dev/ctlptl-registry created
No kind clusters found.
Creating cluster "mycluster" ...
 ✓ Ensuring node image (kindest/node:v1.28.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹ī¸
Deleted nodes: ["mycluster-control-plane"]
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged mycluster-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I1130 13:28:20.858198 170 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W1130 13:28:20.859005 170 initconfiguration.go:336] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
I1130 13:28:20.864778 170 certs.go:112] creating a new certificate authority for ca
[init] Using Kubernetes version: v1.28.0
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
I1130 13:28:20.937181 170 certs.go:519] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost mycluster-control-plane] and IPs [10.96.0.1 172.19.0.2 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
I1130 13:28:21.184603 170 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I1130 13:28:21.821000 170 certs.go:519] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I1130 13:28:22.105460 170 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I1130 13:28:22.310778 170 certs.go:519] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost mycluster-control-plane] and IPs [172.19.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost mycluster-control-plane] and IPs [172.19.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I1130 13:28:23.438336 170 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I1130 13:28:23.585848 170 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I1130 13:28:24.062711 170 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I1130 13:28:24.468577 170 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I1130 13:28:24.642536 170 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I1130 13:28:24.915956 170 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I1130 13:28:24.916004 170 manifests.go:102] [control-plane] getting StaticPodSpecs
I1130 13:28:24.916398 170 certs.go:519] validating certificate period for CA certificate
I1130 13:28:24.916446 170 manifests.go:128] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I1130 13:28:24.916450 170 manifests.go:128] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I1130 13:28:24.916452 170 manifests.go:128] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I1130 13:28:24.916454 170 manifests.go:128] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I1130 13:28:24.916457 170 manifests.go:128] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I1130 13:28:24.916826 170 manifests.go:157] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
I1130 13:28:24.916840 170 manifests.go:102] [control-plane] getting StaticPodSpecs
I1130 13:28:24.916926 170 manifests.go:128] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I1130 13:28:24.916929 170 manifests.go:128] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I1130 13:28:24.916931 170 manifests.go:128] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I1130 13:28:24.916935 170 manifests.go:128] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I1130 13:28:24.916938 170 manifests.go:128] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I1130 13:28:24.916940 170 manifests.go:128] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I1130 13:28:24.916942 170 manifests.go:128] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I1130 13:28:24.917963 170 manifests.go:157] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
I1130 13:28:24.917989 170 manifests.go:102] [control-plane] getting StaticPodSpecs
[control-plane] Creating static Pod manifest for "kube-scheduler"
I1130 13:28:24.918096 170 manifests.go:128] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I1130 13:28:24.918368 170 manifests.go:157] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
I1130 13:28:24.918391 170 kubelet.go:67] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
I1130 13:28:25.086822 170 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
I1130 13:28:25.088822 170 loader.go:395] Config loaded from file: /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I1130 13:28:25.101563 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 3 milliseconds
<multiple repetitions removed for brevity>
I1130 13:29:04.604864 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1130 13:29:05.104074 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
I1130 13:29:09.603468 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1130 13:29:10.105657 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
<multiple repetitions removed for brevity>
I1130 13:29:20.107134 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1130 13:29:20.608924 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
<multiple repetitions removed for brevity>
I1130 13:29:40.108681 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1130 13:29:40.605108 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
<multiple repetitions removed for brevity>
I1130 13:30:20.105067 170 round_trippers.go:553] GET https://mycluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_arm64.s:1172
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:111
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_arm64.s:1172
creating kind cluster: exit status 1
The machine on which this is failing is an Apple M2 with macOS Sonoma version 14.1.1.
Hmm, I'm not sure how to resolve this.
The problem is that kind v0.20.0 has deprecated containerd CRI registry mirrors, see https://github.com/kubernetes-sigs/kind/releases/tag/v0.20.0. It currently ships a compatibility shim, but the shim doesn't work in all cases, and the plan is for the old config to break upstream soon.
I filed a feature request for a better API for this here - https://github.com/kubernetes-sigs/kind/issues/3354 - but the root of the issue isn't even kind, it's upstream containerd deprecations.
Thanks for this @nicks - this is incredibly useful information, as I wasn't aware of the changes.
I've attempted to make the necessary changes, but it seems that insecure registries aren't supported (or I've made some misconfiguration): https://github.com/containerd/containerd/discussions/9454
I resolved this by upgrading to v0.8.28 and making the following amendments:
In the cluster-config.yaml:
apiVersion: ctlptl.dev/v1alpha1
kind: Cluster
product: kind
name: kind-mycluster
kindV1Alpha4Cluster:
  containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = "/etc/containerd/certs.d"
And introducing a file at the following path, /etc/containerd/certs.d/example.internal:5001/hosts.toml:
server = "http://example.internal:5001"
[host."http://example.internal:5001"]
capabilities = ["pull", "resolve", "push"]
skip_verify = true
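Not from the original comment, but as a rough sketch of one way to get that hosts.toml into the node path that config_path points at: the kind node's extraMounts can mount a host directory over /etc/containerd/certs.d (the host path below is a made-up placeholder; adjust it to wherever the file actually lives):

kindV1Alpha4Cluster:
  nodes:
  - role: control-plane
    extraMounts:
    # Hypothetical host directory containing example.internal:5001/hosts.toml
    - hostPath: /absolute/path/to/certs.d
      containerPath: /etc/containerd/certs.d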
An update for anyone reading this: it can now be solved more simply by using registryAuths; an example is here: https://github.com/tilt-dev/ctlptl/blob/main/examples/kind_registry_auth.yaml
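For completeness, a minimal sketch of what that could look like for the registry used in this thread, assuming the field shape shown in the linked example (verify the exact field names there before relying on this):

apiVersion: ctlptl.dev/v1alpha1
kind: Cluster
product: kind
name: kind-mycluster
registryAuths:
# Hypothetical entry for the example.internal:5001 registry used above;
# field names are assumed from the linked kind_registry_auth.yaml example.
- host: example.internal:5001
  endpoint: http://example.internal:5001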