
when running kind in a container with the docker socket mounted: kind create cluster fails to remove control plane taint

Open fasmat opened this issue 3 years ago • 32 comments

What happened:

I tried to create a cluster with kind create cluster and received the error "failed to remove control plane taint"

What you expected to happen:

Successfully creating a cluster

How to reproduce it (as minimally and precisely as possible):

Install kind version v0.14.0-arm64 and call kind create cluster

Anything else we need to know?:

I'm running kind inside a container with the host's (macOS, M1 Max) docker socket mounted, and I'm able to run other containers with docker run.

Logs:

$ kind create cluster --loglevel=debug
WARNING: --loglevel is deprecated, please switch to -v and -q!
Creating cluster "kind" ...
DEBUG: docker/images.go:58] Image: kindest/node:v1.24.0@sha256:0866296e693efe1fed79d5e6c7af8df71fc73ae45e3679af05342239cdc5bc8e present locally
 ✓ Ensuring node image (kindest/node:v1.24.0) 🖼
 ✓ Preparing nodes 📦  
DEBUG: config/config.go:96] Using the following kubeadm config for node kind-control-plane:
apiServer:
  certSANs:
  - localhost
  - 127.0.0.1
  extraArgs:
    runtime-config: ""
apiVersion: kubeadm.k8s.io/v1beta3
clusterName: kind
controlPlaneEndpoint: kind-control-plane:6443
controllerManager:
  extraArgs:
    enable-hostpath-provisioner: "true"
kind: ClusterConfiguration
kubernetesVersion: v1.24.0
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/16
scheduler:
  extraArgs: null
---
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- token: abcdef.0123456789abcdef
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.19.0.2
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    node-ip: 172.19.0.2
    node-labels: ""
    provider-id: kind://docker/kind/kind-control-plane
---
apiVersion: kubeadm.k8s.io/v1beta3
controlPlane:
  localAPIEndpoint:
    advertiseAddress: 172.19.0.2
    bindPort: 6443
discovery:
  bootstrapToken:
    apiServerEndpoint: kind-control-plane:6443
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
kind: JoinConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    node-ip: 172.19.0.2
    node-labels: ""
    provider-id: kind://docker/kind/kind-control-plane
---
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
cgroupRoot: /kubelet
evictionHard:
  imagefs.available: 0%
  nodefs.available: 0%
  nodefs.inodesFree: 0%
failSwapOn: false
imageGCHighThresholdPercent: 100
kind: KubeletConfiguration
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
conntrack:
  maxPerCore: 0
iptables:
  minSyncPeriod: 1s
kind: KubeProxyConfiguration
mode: iptables
 ✓ Writing configuration 📜 
DEBUG: kubeadminit/init.go:82] I0808 18:27:28.895581     126 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W0808 18:27:28.896451     126 initconfiguration.go:332] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.24.0
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0808 18:27:28.900057     126 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0808 18:27:29.115670     126 certs.go:522] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kind-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost] and IPs [10.96.0.1 172.19.0.2 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0808 18:27:29.338086     126 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0808 18:27:29.421219     126 certs.go:522] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0808 18:27:29.554232     126 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0808 18:27:29.615892     126 certs.go:522] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.19.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.19.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0808 18:27:30.083897     126 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
I0808 18:27:30.124183     126 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
I0808 18:27:30.254718     126 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0808 18:27:30.362542     126 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0808 18:27:30.463815     126 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0808 18:27:30.698207     126 kubelet.go:65] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0808 18:27:30.784998     126 manifests.go:99] [control-plane] getting StaticPodSpecs
I0808 18:27:30.785329     126 certs.go:522] validating certificate period for CA certificate
I0808 18:27:30.785397     126 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0808 18:27:30.785414     126 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0808 18:27:30.785417     126 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0808 18:27:30.785420     126 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0808 18:27:30.785424     126 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0808 18:27:30.786696     126 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
I0808 18:27:30.786710     126 manifests.go:99] [control-plane] getting StaticPodSpecs
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0808 18:27:30.786809     126 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0808 18:27:30.786818     126 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0808 18:27:30.786821     126 manifests.go:125] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0808 18:27:30.786823     126 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0808 18:27:30.786826     126 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0808 18:27:30.786828     126 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0808 18:27:30.786830     126 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
I0808 18:27:30.787252     126 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
I0808 18:27:30.787273     126 manifests.go:99] [control-plane] getting StaticPodSpecs
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0808 18:27:30.787392     126 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0808 18:27:30.787617     126 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0808 18:27:30.787952     126 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0808 18:27:30.787989     126 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
I0808 18:27:30.788334     126 loader.go:372] Config loaded from file:  /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I0808 18:27:30.790085     126 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s  in 0 milliseconds
 ✗ Starting control-plane 🕹️ 
ERROR: failed to create cluster: failed to remove control plane taint: command "docker exec --privileged kind-control-plane kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes --all node-role.kubernetes.io/control-plane- node-role.kubernetes.io/master-" failed with error: exit status 1
Command Output: The connection to the server kind-control-plane:6443 was refused - did you specify the right host or port?
Stack Trace: 
sigs.k8s.io/kind/pkg/errors.WithStack
        sigs.k8s.io/kind/pkg/errors/errors.go:59
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
        sigs.k8s.io/kind/pkg/exec/local.go:124
sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run
        sigs.k8s.io/kind/pkg/cluster/internal/providers/docker/node.go:146
sigs.k8s.io/kind/pkg/cluster/internal/create/actions/kubeadminit.(*action).Execute
        sigs.k8s.io/kind/pkg/cluster/internal/create/actions/kubeadminit/init.go:140
sigs.k8s.io/kind/pkg/cluster/internal/create.Cluster
        sigs.k8s.io/kind/pkg/cluster/internal/create/create.go:135
sigs.k8s.io/kind/pkg/cluster.(*Provider).Create
        sigs.k8s.io/kind/pkg/cluster/provider.go:182
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.runE
        sigs.k8s.io/kind/pkg/cmd/kind/create/cluster/createcluster.go:80
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.NewCommand.func1
        sigs.k8s.io/kind/pkg/cmd/kind/create/cluster/createcluster.go:55
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/[email protected]/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/[email protected]/command.go:974
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/[email protected]/command.go:902
sigs.k8s.io/kind/cmd/kind/app.Run
        sigs.k8s.io/kind/cmd/kind/app/main.go:53
sigs.k8s.io/kind/cmd/kind/app.Main
        sigs.k8s.io/kind/cmd/kind/app/main.go:35
main.main
        sigs.k8s.io/kind/main.go:25
runtime.main
        runtime/proc.go:250
runtime.goexit
        runtime/asm_arm64.s:1263

Environment:

  • kind version: (use kind version): kind v0.14.0 go1.18.2 linux/arm64
  • Kubernetes version: (use kubectl version): Client Version: v1.24.3 Kustomize Version: v4.5.4
  • Docker version: (use docker info):
Client:
  Context:    default
  Debug Mode: false
  Plugins:
    buildx: Docker Buildx (Docker Inc., 0.8.2+azure-1)
    compose: Docker Compose (Docker Inc., 2.9.0+azure-1)

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 4
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.104-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 5
 Total Memory: 14.62GiB
 Name: docker-desktop
 ID: DWAP:AOR6:N5DU:HCAK:GC35:RRZ6:4YMP:4JVL:UJ66:GKCY:N6RR:VAAL
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5000
  127.0.0.0/8
 Live Restore Enabled: false
  • OS (e.g. from /etc/os-release):
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

fasmat avatar Aug 08 '22 18:08 fasmat

Can you share the logs folder from kind create cluster --retain; kind export logs; kind delete cluster?

Just to confirm: Everything is running in arm64 mode, no amd64 images / binaries? (See: https://github.com/kubernetes-sigs/kind/issues/2718 which has been a somewhat common issue for M1 users)

BenTheElder avatar Aug 08 '22 18:08 BenTheElder

Maybe to clarify: running kind create cluster on the Mac directly works without issues.

I'm trying to run the command from inside a container that has access to the host's docker socket. This container is also built for arm64 and can run other containers without issues.

Alternatively, if I create the cluster on the host, how can I access it from inside a container? Is there some way to reach kind-control-plane with kubectl from inside my container, e.g. by putting both of them on the same docker network?

fasmat avatar Aug 08 '22 18:08 fasmat

I switched from docker-from-docker to a docker-in-docker setup.

This is probably less performant, but it works for now. Closing the issue, since I probably tried something that is not officially supported.

fasmat avatar Aug 08 '22 19:08 fasmat

Alternatively if I create the cluster on the host, how can I access it from inside a container? Is there some way to access kind-control-plane with kubectl from inside my container, e.g. by having both of them inside the same docker network?

you can put the other container on the "kind" docker network, either with the --net flag or with docker network connect

then you can use kind export kubeconfig --internal

BenTheElder avatar Aug 08 '22 19:08 BenTheElder

you can put the other container on the "kind" docker network, either with the --net flag or with docker network connect

then you can use kind export kubeconfig --internal

Thanks! I'll try that and compare performance.

fasmat avatar Aug 08 '22 19:08 fasmat

It’s weird that this is working from the host but not via docker socket in a container. I’m not sure why we’d see that. Maybe proxy config differences?

BenTheElder avatar Aug 08 '22 19:08 BenTheElder

I'm not sure. I need to test more to find out. I had issues with volumes/mounts before when using the host's docker socket: I had to specify paths as they are on the host rather than as they appear inside the container.

fasmat avatar Aug 08 '22 19:08 fasmat

I am also having this issue on my m1 macbook air when using a docker-from-docker setup

luctowers avatar Aug 25 '22 20:08 luctowers

~~Interestingly, docker network connect kind $HOSTNAME does allow me to curl the control plane via https://kind-control-plane:6443.~~

~~Despite this I still get this error when starting control plane via kind create cluster.~~

ERROR: failed to create cluster: failed to remove control plane taint: command "docker exec --privileged kind-control-plane kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes --all node-role.kubernetes.io/control-plane- node-role.kubernetes.io/master-" failed with error: exit status 1
Command Output: The connection to the server kind-control-plane:6443 was refused - did you specify the right host or port?

Nevermind, it's a problem with the name resolution from within the control-plane itself, not the caller of kind create cluster.

luctowers avatar Aug 25 '22 20:08 luctowers

After some investigating: the problem is that name resolution in the control-plane doesn't work immediately when the container is started. Adding a sleep before running the remove-taint command works as a hacky fix.

diff --git a/pkg/cluster/internal/create/actions/kubeadminit/init.go b/pkg/cluster/internal/create/actions/kubeadminit/init.go
index cc587940..e9778ce9 100644
--- a/pkg/cluster/internal/create/actions/kubeadminit/init.go
+++ b/pkg/cluster/internal/create/actions/kubeadminit/init.go
@@ -19,6 +19,7 @@ package kubeadminit
 
 import (
        "strings"
+       "time"
 
        "sigs.k8s.io/kind/pkg/errors"
        "sigs.k8s.io/kind/pkg/exec"
@@ -135,6 +136,7 @@ func (a *action) Execute(ctx *actions.ActionContext) error {
                taintArgs := []string{"--kubeconfig=/etc/kubernetes/admin.conf", "taint", "nodes", "--all"}
                taintArgs = append(taintArgs, taints...)
 
+               time.Sleep(5 * time.Second)
                if err := node.Command(
                        "kubectl", taintArgs...,
                ).Run(); err != nil {

I don't know why this is the case with our environments. We are both running docker-from-docker on Apple M1; not sure how much is coincidence.

Thoughts @BenTheElder ?

luctowers avatar Aug 25 '22 22:08 luctowers

I was also able to reproduce this issue with a docker-from-docker setup on x86_64 Ubuntu.

luctowers avatar Aug 26 '22 22:08 luctowers

The problem is, the name resolution in the control-plane doesn't work immediately when the container is started.

Uh, that shouldn't be the case. Sounds like a docker bug?

Can you explain what you mean by "docker-from-docker", exactly? Is that like docker-in-docker (docker running inside of a docker container) or docker with the socket mounted to a container?

BenTheElder avatar Aug 26 '22 23:08 BenTheElder

Can you explain what you mean by "docker-from-docker", exactly? Is that like docker-in-docker (docker running inside of a docker container) or docker with the socket mounted to a container?

Just a minimal setup mounting /var/run/docker.sock from the host into the container.

luctowers avatar Aug 26 '22 23:08 luctowers

Uh, that shouldn't be the case. Sounds like a docker bug?

Maybe... I tested with older versions of kind, and they have the same issue. Seems odd that we are only just finding it now, unless something got broken in docker.

luctowers avatar Aug 26 '22 23:08 luctowers

Sorry, too many things going on 😅

Does this happen if you use the docker client from the host, running on the host, instead?

It sounds like this environment is broken, and I'd rather not add a sleep to hack around it (I mean it could take longer elsewhere as well)

but we could consider, for example, using loopback instead. But I'd like to know more about why this is failing; something is off with the network in this setup.

BenTheElder avatar Sep 09 '22 21:09 BenTheElder

Does this happen if you use the docker client from the host, running on the host, instead?

No, it does not. It may have something to do with the added latency or overhead of the mounted socket somehow? I agree, the sleep is a bad, hacky solution. I think using the loopback address is likely best.

luctowers avatar Sep 13 '22 01:09 luctowers

Facing the same issue here; it seems highly error-prone. I've observed that it sometimes starts working after several retries.

I also don't see how the socket should be an issue here, as the error is DNS resolution from within the container 🤔

Note that in my case, I've been using the kind Go library rather than the kind binary to perform these operations. I'm not sure if that is the case for others in this issue, or whether the command in the kind binary has some built-in retry mechanism. Seems like an obscure problem that's hard to debug, for sure.

An additional note on my setup is that I've been calling docker over TCP rather than a unix socket (running socat locally to proxy to the socket).

phroggyy avatar Sep 20 '22 15:09 phroggyy

I also don't see how the socket should be an issue here, as the error is DNS resolution from within the container 🤔

So the DNS resolution in the container comes from a (different) socket docker embeds.

See the note about custom networks in:

https://docs.docker.com/config/containers/container-networking/#dns-services

If you're using kind via a container that mounts the docker socket, it's possible docker behaves differently here (?) Unfortunately I haven't had time to dig into this myself yet.

This is not really a use case we've been focused on; kind is a statically linked Go binary meant to be run on the host.

Rather than the kind binary to perform these operations. I'm not sure if that is the case for others in this issue, and if so, that the command in the kind binary has some built-in retry mechanism?

No, the CLI code is a very small wrapper over the public APIs; there's no special retry logic.

An additional note on my setup is that I've been calling docker over TCP rather than a unix socket (running socat on local to proxy to the socket).

FYI, this is unfortunately also known to have issues; off the top of my head, there's no way for kind to reliably reserve a random TCP port for the API server. We do permit setting an explicit 0 port and letting docker pick instead, but then on restart docker will assign another port. (I don't have the link handy, but there's past discussion in the issue tracker.)

BenTheElder avatar Sep 20 '22 18:09 BenTheElder

@BenTheElder to clarify, I run the docker socket on my local machine and use socat to expose it specifically on :2375 (I wanted to minimise messing with the Docker Desktop configuration). So I don't think the TCP part should cause further issues.

Do you think it would make sense to add some retry mechanism to this taint call? What's happening now is that the whole creation gets rolled back after the cluster is already created, effectively due to networking. I'm wondering if it might make sense to add an exponential-backoff retry (with a fairly low max) to make this more reliable. As you said, kind isn't really built for this use case, so I'd also understand if you don't want to add maintenance burden for an edge case. Thoughts?

Also, just to clarify what's happening: the error here isn't coming from a call to the docker daemon. Rather, the error we're getting comes from the kubectl call running inside the new cluster (on the node). This should, at least from my limited understanding, work the same regardless of whether the docker call is made locally, from a mounted socket, or over the network, since our error is happening inside the created container. Or, put differently, it's not docker exec --privileged that's failing, but rather kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes ... within the container (and it's executed within that container regardless of how you called docker, I'd assume).

phroggyy avatar Sep 25 '22 12:09 phroggyy

Do you think it would make sense to add some retry mechanism to this taint call?

Tentatively, that's just patching over one particular symptom of the networking being broken.

What's happening now is that the whole creation gets rolled back after the cluster is already created, effectively due to networking.

Well yes, we can't very well run a functional cluster with broken networking.

I'm wondering if it might make sense to add an exponential backoff retry (with a fairly low max) to make this more reliable.

I don't think that's reasonable after the API server is up; this is a local API call executed on one of the control plane nodes itself. We already have exponential retry waiting for the API server to be ready in kubeadm. Perhaps one retry, but again, this should not flake; it should be a very cheap local call, and if it's failing, it's a symptom of the cluster being in a bad state of some sort.

I suggested a possible solution above, but I'd like to understand what is actually broken, and why, before I jump into making any changes.

There's no reason resolving the container names should fail; docker is responsible for this.

Or put differently, it's not docker exec --privileged that's failing, but rather kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes ... within the container (and it's executed within that container regardless of how you called docker, I'd assume).

Yes, but it only seems to be failing when the docker socket is mounted when using kind, and it seems to be related to DNS issues, which makes me think mounting the docker socket when creating the cluster is leading to somewhat broken DNS in the cluster. That doesn't make sense given my understanding of how docker implements DNS, but none of this makes sense ... The DNS response for the node name should be local from docker and should be quick and reliable ™️

So far we've had no reports of this with the standard local docker socket, without containerizing kind itself or using docker over TCP, though I can't fathom why those are relevant.

Unfortunately, without a way to replicate this, I'm reliant on you all to identify why docker containers are not reliably able to resolve themselves or what else is making this call fail.

BenTheElder avatar Sep 25 '22 21:09 BenTheElder

I'm running into the exact same problem in a "docker-outside-docker" setup (i.e., bind-mounting /var/run/docker.sock into a container, then trying to run kind in that container):

$ kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.32.2) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
Deleted nodes: ["kind-control-plane"]
ERROR: failed to create cluster: failed to remove control plane taint: command "docker exec --privileged kind-control-plane kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes --all node-role.kubernetes.io/control-plane-" failed with error: exit status 1
Command Output: E0501 15:15:52.848184     280 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://kind-control-plane:6443/api?timeout=32s\": dial tcp 172.18.0.3:6443: connect: connection refused"
E0501 15:15:52.849828     280 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://kind-control-plane:6443/api?timeout=32s\": dial tcp 172.18.0.3:6443: connect: connection refused"
The connection to the server kind-control-plane:6443 was refused - did you specify the right host or port?

Please note that the problem is not DNS resolution, but a failure to connect (in my case, like in the original issue, the error message is The connection to the server kind-control-plane:6443 was refused).

If I kind create cluster --retain, then kind get kubeconfig, then tweak the kubeconfig file, I can get it to work (but the setup of the cluster isn't completed since there are probably steps that should happen after removing the control plane taint; like setting up kubenet, maybe?)

It looks like whatever logic is supposed to wait for the control plane to come up doesn't work correctly when running in "docker-outside-docker". But why? It is a great mystery, that I may or may not end up being able to solve 😅

I'm not expecting a reply from maintainers; I'm just leaving a note here to indicate that the issue still exists, and clarify that it's not DNS. (Maybe it's not DNS this time :))

jpetazzo avatar May 01 '25 15:05 jpetazzo

It looks like whatever logic is supposed to wait for the control plane to come up doesn't work correctly when running in "docker-outside-docker". But why? It is a great mystery, that I may or may not end up being able to solve 😅

@jpetazzo can you upload the kind export logs from that cluster?

aojea avatar May 02 '25 15:05 aojea

Sure; here it is:

kind-export-logs.tgz

(I'm sorry, I thought I'd be able to analyze the logs myself, but I see many different files and I don't know exactly what they correspond to; so I'm providing the whole thing!)

jpetazzo avatar May 07 '25 15:05 jpetazzo

Sharing some observations

etcd start

2025-05-07T15:17:30.129349569Z stderr F {"level":"warn","ts":"2025-05-07T15:17:30.129187Z","caller":"embed/config.go:689","msg":"Running http and grpc server on single port. This is not recommended for production."}

etcd ready

2025-05-07T15:17:30.129349569Z stderr F {"level":"warn","ts":"2025-05-07T15:17:30.129187Z","caller":"embed/config.go:689","msg":"Running http and grpc server on single port. This is not recommended for production."}

apiserver start earlier

2025-05-07T15:17:29.748377007Z stderr F W0507 15:17:29.748090 1 registry.go:256] calling componentGlobalsRegistry.AddFlags more than once, the registry will be set by the latest flags

apiserver fails to connect to etcd

2025-05-07T15:17:30.066501719Z stderr F W0507 15:17:30.066423 1 logging.go:55] [core] [Channel #5 SubChannel #6]grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused"

containerd starts etcd pod at

May 07 15:17:29 kind-control-plane containerd[111]: time="2025-05-07T15:17:29.154047822Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-kind-control-plane,Uid:3390d474380a154813a08ce64c49a997,Namespace:kube-system,Attempt:0,}"

and apiserver at

May 07 15:17:29 kind-control-plane containerd[111]: time="2025-05-07T15:17:29.157574508Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-apiserver-kind-control-plane,Uid:bc76fff9173ce3f33afc494e8ee291b4,Namespace:kube-system,Attempt:0,}"

I think the pods should get restarted later and the cluster should end up working

aojea avatar May 09 '25 12:05 aojea

Oh, like I mentioned in the earlier comment - the cluster works:

If I kind create cluster --retain, then kind get kubeconfig, then tweak the kubeconfig file, I can get it to work

I just have no idea why kind doesn't wait for the control plane to be up before trying to remove the taint.

Interestingly, in the same container, if I use docker-in-docker instead of docker-outside-docker (=bind-mount of the Docker socket), kind create cluster works perfectly fine. Much mystery! :-)

Thanks for looking at this though, I appreciate the insights! 🙏🏻

jpetazzo avatar May 09 '25 14:05 jpetazzo

I just have no idea why kind doesn't wait for the control plane to be up before trying to remove the taint.

kubeadm isn't supposed to return success before the apiserver is ready. It internally needs to interact with the API, so by the time we're trying to remove the taint there have already been a few other API requests from kubeadm (e.g. to install the kube-proxy daemonset, to upload some configmaps, etc.: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#synopsis)

BenTheElder avatar May 09 '25 21:05 BenTheElder

Ohhh, you're right, thanks for pointing that out!

I cranked up verbosity to 99999, and when running kind locally, I get:

[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 502.099964ms

While running it with docker-outside-docker gives me:

[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
 ✗ Starting control-plane 🕹️
Deleted nodes: ["kind-control-plane"]
ERROR: failed to create cluster: failed to remove control plane taint: command "docker exec --privileged kind-control-plane kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes --all node-role.kubernetes.io/control-plane-" failed with error: exit status 1

So, it looks like kubeadm gets kind of aborted, but in a way that doesn't get detected by kind. Weird!

Apparently, kind is executing this command to run kubeadm (at least, if I'm to believe the output of ps faux while kind create cluster is running):

docker exec --privileged kind-control-plane kubeadm init --config=/kind/kubeadm.conf --skip-token-print --v=6

I tried to execute that command manually, with a few variations.

And lo and behold: in our particular scenario, kubeadm init seems to fail when stdin is closed. In other words:

# This works
docker exec --privileged -i kind-control-plane kubeadm init --config=/kind/kubeadm.conf --skip-token-print --v=6
# This does not
docker exec --privileged    kind-control-plane kubeadm init --config=/kind/kubeadm.conf --skip-token-print --v=6

(At first I thought we were dealing with controlling-terminal issues and was reminded of Docker issue 1422, but this seems a bit different :))

I had a quick glance at the kubeadm code (around here I suppose) but didn't see anything immediately obvious; but I didn't walk that function call tree up and down to see if there could be anything relying on stdin around it. I also don't know why this would behave differently between "raw docker" and "docker-outside-of-docker" 😅

This feels like incremental progress. Maybe. :)

jpetazzo avatar May 10 '25 08:05 jpetazzo

I think I've managed to isolate this from kind (I still think there's a bug in the way kind is running kubeadm, though).

I'm on macOS, using OrbStack, and running the following in a VS Code devcontainer.

If I do this in one terminal: docker run --rm -it ubuntu bash

And then docker exec a sleep command in another terminal without "-i", the command returns early:

time docker exec ubuntu sleep 5

real    0m0.542s
user    0m0.016s
sys     0m0.022s

Adding in "-i" lets sleep run for the 5 seconds:

time docker exec -i ubuntu sleep 5

real    0m5.072s
user    0m0.017s
sys     0m0.024s

If I then run on the host directly, we see sleep running for the full 5 seconds even without "-i":

time docker exec ubuntu sleep 5

real    0m5.100s
user    0m0.009s
sys     0m0.012s

So as @jpetazzo mentions, we're seeing a difference in the docker exec behaviour directly on the host vs. inside a container.

chibbert avatar Jul 08 '25 15:07 chibbert

The above behaviour also occurs on an Ubuntu host with a standard docker-ce install of the docker engine, so the commonality appears to be docker exec behaviour.

Possible workaround: add the equivalent of "-i" to the Go code in kind?

chibbert avatar Jul 10 '25 06:07 chibbert

Just tried a "-i" in node.go: func (c *nodeCmd) Run()

But that doesn't help. This feels like something to do with stdin connections into docker exec.

chibbert avatar Jul 10 '25 07:07 chibbert