kind
"x509: certificate signed by unknown authority" on a fresh new kind cluster running alongside minikube --vm-driver=none
What happened:
$ curl -Lo ~/bin/kind "https://kind.sigs.k8s.io/dl/v0.11.1/kind-$(uname)-amd64"
$ chmod +x ~/bin/kind
$ kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂
$ kubectl --context=kind-kind get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-b8qs8 0/1 Running 0 17m
kube-system coredns-558bd4d5db-tfn48 0/1 Running 0 17m
kube-system etcd-kind-control-plane 1/1 Running 0 18m
kube-system kindnet-bv86g 1/1 Running 0 17m
kube-system kube-apiserver-kind-control-plane 1/1 Running 0 18m
kube-system kube-controller-manager-kind-control-plane 1/1 Running 0 18m
kube-system kube-proxy-btkd2 1/1 Running 0 17m
kube-system kube-scheduler-kind-control-plane 1/1 Running 0 18m
local-path-storage local-path-provisioner-547f784dff-57n2l 0/1 CrashLoopBackOff 8 17m
$ kubectl --context=kind-kind logs -n local-path-storage local-path-provisioner-547f784dff-57n2l
time="2021-10-26T12:53:43Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get https://10.96.0.1:443/version?timeout=32s: x509: certificate signed by unknown authority"
$ kubectl --context=kind-kind logs -n kube-system coredns-558bd4d5db-b8qs8
...
E1026 12:55:58.167556 1 reflector.go:127] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:156: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get "https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
E1026 12:56:03.319519 1 reflector.go:127] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:156: Failed to watch *v1.Namespace: failed to list *v1.Namespace: Get "https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
E1026 12:56:05.587305 1 reflector.go:127] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:156: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
I checked the issues already reported (https://github.com/kubernetes-sigs/kind/issues?q=is%3Aissue+%22x509%3A+certificate+signed+by+unknown+authority%22) but this one seems different :/ Also FYI: I am running a bare-metal minikube on the same machine (vm-driver=none).
Environment:
$ kind version
kind v0.11.1 go1.16.4 linux/amd64
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"archive", BuildDate:"2021-06-13T07:08:18Z", GoVersion:"go1.15.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
$ docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
Server:
Containers: 174
Running: 73
Paused: 0
Stopped: 101
Images: 40
Server Version: 20.10.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc nvidia nvidia-experimental
Default Runtime: nvidia
Init Binary: docker-init
containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.46-5rodete1-amd64
Operating System: Debian GNU/Linux rodete
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 62.81GiB
Name: ensonic.muc.corp.google.com
ID: FE3G:TCOF:UPXI:K3OY:S7JL:ZTOK:H7Q4:YQS2:AKT4:IVWN:NUDI:54ZE
Docker Root Dir: /usr/local/google/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://mirror.gcr.io/
Live Restore Enabled: false
Two things I see in the logs (from kind export logs):
# kind-control-plane/journal.log
Oct 26 13:16:18 kind-control-plane containerd[177]: time="2021-10-26T13:16:18.873978746Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
...
Oct 26 13:16:25 kind-control-plane kubelet[269]: E1026 13:16:25.870383 269 certificate_manager.go:437] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post "https://kind-control-plane:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp [fc00:f853:ccd:e793::2]:6443: connect: connection refused
Oct 26 13:16:28 kind-control-plane kubelet[269]: E1026 13:16:28.023860 269 certificate_manager.go:437] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Post "https://kind-control-plane:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp [fc00:f853:ccd:e793::2]:6443: connect: connection refused
Is the /etc/cni/net.d part of the node image? Asking since I also have that directory on the host (for the bare-metal minikube).
Also when I look at:
# kube-system/kube-scheduler-kind-control-plane:kube-scheduler
I1026 13:16:34.423368 1 serving.go:347] Generated self-signed cert in-memory
W1026 13:16:40.724458 1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccou
W1026 13:16:40.724501 1 authentication.go:337] Error looking up in-cluster authentication configuration: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot get resource "configmaps" in API group "" in the namespace "kube-syste
W1026 13:16:40.724514 1 authentication.go:338] Continuing without authentication configuration. This may treat all requests as anonymous.
W1026 13:16:40.724525 1 authentication.go:339] To require authentication configuration lookup to succeed, set --authentication-tolerate-lookup-failure=false
I1026 13:16:40.755914 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1026 13:16:40.755958 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1026 13:16:40.756242 1 secure_serving.go:197] Serving securely on 127.0.0.1:10259
I1026 13:16:40.756320 1 tlsconfig.go:240] Starting DynamicServingCertificateController
E1026 13:16:40.757975 1 reflector.go:138] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler
E1026 13:16:40.758489 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolum
E1026 13:16:40.758545 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcont
E1026 13:16:40.758663 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
...
Any idea about this?
What is the output of docker ps on that host?
Can you try to create a cluster with a different name: kind create cluster --name testkind?
Same if I name it testkind:
$ kubectl --context=kind-testkind get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-hcrbt 0/1 Running 0 4m34s
kube-system coredns-558bd4d5db-w8g6m 0/1 Running 0 4m34s
kube-system etcd-testkind-control-plane 1/1 Running 0 4m44s
kube-system kindnet-dcn6g 1/1 Running 0 4m34s
kube-system kube-apiserver-testkind-control-plane 1/1 Running 0 4m44s
kube-system kube-controller-manager-testkind-control-plane 1/1 Running 0 4m52s
kube-system kube-proxy-ptlvk 1/1 Running 0 4m34s
kube-system kube-scheduler-testkind-control-plane 1/1 Running 0 4m51s
local-path-storage local-path-provisioner-547f784dff-9tgsg 0/1 CrashLoopBackOff 5 4m34s
$ docker ps | grep kindest
5eb142bfad98 kindest/node:v1.21.1 "/usr/local/bin/entr…" About a minute ago Up About a minute 127.0.0.1:37423->6443/tcp kind-control-plane
Since I have minikube running too, a full docker ps would list another 66 containers.
Honestly I can't understand this, unless DNS is messed up. Can you verify that the hostname matches the IP of the node?
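Something like this should show it (a sketch; the node name kind-control-plane and the docker network name kind are kind's defaults, adjust if you used --name):

```shell
#!/bin/sh
# Sketch: compare the IP docker assigned to the kind node with what the
# node resolves its own hostname to. Names below are kind defaults.
node=${1:-kind-control-plane}

# IP on the "kind" docker network, as docker sees it.
ip_docker=$(docker inspect -f '{{.NetworkSettings.Networks.kind.IPAddress}}' "$node")

# First IPv4 address the node itself resolves for its own name.
ip_dns=$(docker exec "$node" getent ahostsv4 "$node" | awk 'NR==1{print $1}')

if [ "$ip_docker" = "$ip_dns" ]; then
  echo "OK: $node resolves to $ip_docker"
else
  echo "MISMATCH: docker assigned $ip_docker but the node resolves $ip_dns"
fi
```

A mismatch here would point at the host's DNS setup leaking into the node.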
Also FYI: I am running a bare-metal minikube on the same machine (vm-driver=none).
... is there a reason for this? I don't know what all bare-metal minikube does these days, but I would not be surprised if it's related.
Does it also fail in this environment if you clean up the bare metal minikube first?
I run kind clusters on rodete all the time (hi googler!) and at the moment rodete should be fine (in the past we've had fun with things like cgroupsv2 breaking the available docker version and preventing older k8s releases from working 😄)
certs are handled by kubeadm and aren't anything terribly special. I suspect the bare metal minikube networking is interfering here? Probably the lookup to the API server is conflicting with the bare metal cluster on that host.
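One way to check that from the outside (a sketch; 37423 is the host port from the docker ps output above, substitute your own, and the context name assumes the default kind cluster):

```shell
# Extract the CA that the kubeconfig trusts for the kind-kind context.
kubectl config view --raw \
  -o jsonpath='{.clusters[?(@.name=="kind-kind")].cluster.certificate-authority-data}' \
  | base64 -d > /tmp/kind-ca.crt

# Grab the serving certificate from the published apiserver endpoint.
openssl s_client -connect 127.0.0.1:37423 </dev/null 2>/dev/null \
  | openssl x509 -outform PEM > /tmp/kind-apiserver.crt

# "OK" means the endpoint serves a cert signed by the kubeconfig's CA;
# a verification error means we are actually talking to a different apiserver.
openssl verify -CAfile /tmp/kind-ca.crt /tmp/kind-apiserver.crt
```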
Is the /etc/cni/net.d part of the node image? Asking since I also have the directory on the host (for the bare-metal minikube).
This is something that gets written out when the networking daemon (kindnetd) starts up; we run it as a daemonset (pretty typical for CNI implementations). It is read by containerd when creating pods that don't use host networking (the apiserver, the networking agent, and kube-proxy all use host networking, and kubelet runs on the host).
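So the two directories are independent; you can see both (a sketch; the conflist file name is what kindnetd writes in recent releases, treat it as an assumption and ls first):

```shell
# CNI config inside the kind node container, written by kindnetd:
docker exec kind-control-plane ls /etc/cni/net.d
docker exec kind-control-plane cat /etc/cni/net.d/10-kindnet.conflist

# The host's /etc/cni/net.d belongs to the bare-metal minikube and is
# not read by kind's containerd:
ls /etc/cni/net.d
```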
vm-driver=none since we use k8s on appliances (need to access hw).
Er, but why this and kind? (Also, you can use the extraMounts config to pass hardware vfs through to kind nodes.)
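For example, something like this in the kind config (a sketch; the device path is a made-up example, adjust to your hardware):

```yaml
# kind-config.yaml — pass a host device path through to the node
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /dev/bus/usb
    containerPath: /dev/bus/usb
```

Then: kind create cluster --config kind-config.yaml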
I think the networking changes vm-driver=none makes are conflicting with kind, and I'm not sure this is super reasonable to debug and support.
I really don't recommend running vm-driver=none on your workstation.
More specifically, I would guess that running minikube with vm-driver=none is screwing up DNS resolution and we are actually reaching the wrong apiserver.
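If that's the theory, a clash is plausible because both kind and a default minikube use 10.96.0.1 as the kubernetes Service IP; a sketch of where I'd look:

```shell
# Service-IP rules inside the kind node (its own netns/iptables):
docker exec kind-control-plane sh -c "iptables-save | grep -F '10.96.0.1'"

# The bare-metal minikube installs rules for the same IP on the host;
# if the kind node's traffic ever hits these, it reaches the wrong
# apiserver (and hence a cert signed by the wrong CA):
sudo iptables-save | grep -F '10.96.0.1'
```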
We're using kind for tests and minikube for the developers.
Given Kubernetes is removing dockershim support in 1.24, that approach is probably going to become its own headache for other reasons ...
Still inclined to suggest that vm-driver=none is causing the networking issues here. I doubt that kubeadm is actually generating bad certs; ordinarily docker is responsible for the node names resolving correctly.
Can you mount the devices you need through to a containerized or VM-based cluster? vm-driver=none takes over the host environment in various ways, and running kubeadm init in a developer environment is generally avoided (e.g. the kubeadm team use "kinder", an extension of this project).