ignite
DNS misconfiguration in ignite image
Context
- Ignite: v0.6.3
- Ubuntu 18.04: GCP custom image created via
gcloud compute images create nested-virt \
--source-image-project=ubuntu-os-cloud \
--source-image-family=ubuntu-1804-lts \
--licenses="https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"
Note that I changed 1604-lts to 1804-lts in the image family.
- weaveworks/ignite-kubeadm:latest-v0.6.3
Steps
- Install Ignite v0.6.3
- Run a test VM:
ignite run weaveworks/ignite-kubeadm:latest-v0.6.3 \
--cpus 2 \
--memory 8GB \
--ssh \
--name vm
- ignite ssh vm
- Test for Empty /etc/resolv.conf
[ -s /etc/resolv.conf ] || echo '/etc/resolv.conf is empty'
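The `-s` test above only catches a zero-byte file. A file that contains nothing but comments would pass it and still leave the resolver unconfigured. A slightly stricter check might look like the sketch below; the function name `resolv_conf_status` is hypothetical and not part of Ignite.

```shell
# Hypothetical helper (not part of Ignite): classify a resolv.conf file.
# Prints one of: missing, empty, no-nameserver, ok.
resolv_conf_status() {
  local file="${1:-/etc/resolv.conf}"
  if [ ! -e "$file" ]; then
    echo "missing"
  elif [ ! -s "$file" ]; then
    # Same condition as the one-liner above: the file exists but has zero bytes.
    echo "empty"
  elif ! grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]' "$file"; then
    # The file has content (for example only comments) but no nameserver entry.
    echo "no-nameserver"
  else
    echo "ok"
  fi
}
```

Inside the guest, running `resolv_conf_status /etc/resolv.conf` would then distinguish an absent file from an empty one or one without any nameserver line.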
Similar issue: #213
By the way, I was trying to run kubeadm in HA mode with Ignite VMs, following this tutorial: https://github.com/weaveworks/ignite/tree/master/images/kubeadm. The kubelet failed its health checks even though it is clearly running:
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sat 2020-01-25 15:21:18 UTC; 6min ago
Docs: https://kubernetes.io/docs/home/
Main PID: 10957 (kubelet)
Tasks: 16 (limit: 4915)
CGroup: /system.slice/kubelet.service
└─10957 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --resolv-conf=/run/systemd/resolve/resolv.conf
Jan 25 15:27:24 localhost.localdomain kubelet[10957]: E0125 15:27:24.714219 10957 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://firekube.luxas.dev:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: connect: no route to host
Jan 25 15:27:24 localhost.localdomain kubelet[10957]: E0125 15:27:24.714695 10957 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://firekube.luxas.dev:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost.localdomain&limit=500&resourceVersion=0: dial tcp 172.17.0.2:6443: connect: no route to host
Jan 25 15:27:24 localhost.localdomain kubelet[10957]: E0125 15:27:24.718970 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:24 localhost.localdomain kubelet[10957]: E0125 15:27:24.819359 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:24 localhost.localdomain kubelet[10957]: E0125 15:27:24.920169 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:25 localhost.localdomain kubelet[10957]: E0125 15:27:25.020418 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:25 localhost.localdomain kubelet[10957]: E0125 15:27:25.120828 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:25 localhost.localdomain kubelet[10957]: E0125 15:27:25.221174 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:25 localhost.localdomain kubelet[10957]: E0125 15:27:25.321544 10957 kubelet.go:2248] node "localhost.localdomain" not found
Jan 25 15:27:25 localhost.localdomain kubelet[10957]: E0125 15:27:25.421862 10957 kubelet.go:2248] node "localhost.localdomain" not found
@rugwirobaker, what exactly is the misconfiguration?
Is your VM missing an /etc/resolv.conf, as this command implies?
- Test for Empty /etc/resolv.conf
[ -s /etc/resolv.conf ] || echo '/etc/resolv.conf is empty'
Do you know if you're using systemd-resolved on the GCP host vm? We have some code in ignite that detects this, and I'm curious what your host configuration is.
Can you show the contents of these files on the host?
/etc/resolv.conf
/run/systemd/resolve/resolv.conf
Thank you for getting back to me.
$ cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "systemd-resolve --status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.
nameserver 127.0.0.53
options edns0
search us-east1-b.c.k8s-300.internal c.k8s-300.internal google.internal
and
$ cat /run/systemd/resolve/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.
nameserver 169.xxx.xxx.xxx
search us-east1-b.c.k8s-300.internal c.k8s-300.internal google.internal
Note: the nameserver was a complete IP address. I redacted everything after the first octet in this snippet on purpose.
I'm guessing your host's nameserver is the GCP metadata server, 169.254.169.254.
Cloud DNS reference: https://cloud.google.com/dns/docs/overview#vpc-name-resolution-order
I'm surprised there is no /etc/resolv.conf in your guest VMs.
Your host's /run/systemd/resolve/resolv.conf should match the /etc/resolv.conf inside the wrapper containers for your ignite VMs.
DHCPv4 is then used to communicate those settings to the guests.
Are you using the docker or containerd backend for ignite? Docker has its own codepath to ensure the container's DNS settings. The code for ignite+containerd is here: https://github.com/weaveworks/ignite/blob/f3bb9de/pkg/resolvconf/resolvconf.go#L28-L50
Can you check that the wrapper containers have the right /etc/resolv.conf?
That will help us know if the problem is the ignite CLI or the in-container DHCPv4+guest-config.
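To make the host-side expectation concrete, here is a rough shell sketch of how a tool could choose which host file to hand to the wrapper container when systemd-resolved is in use. This is not Ignite's code (that is the Go source linked above). The function name `pick_resolv_conf` and the rule "a single 127.0.0.53 nameserver means systemd-resolved" are assumptions made for illustration.

```shell
# Hypothetical sketch, not Ignite's implementation: print the path of the
# resolv.conf whose contents should end up in the wrapper container.
# Both paths can be overridden so the logic can be tried on test files.
pick_resolv_conf() {
  local default_path="${1:-/etc/resolv.conf}"
  local upstream_path="${2:-/run/systemd/resolve/resolv.conf}"
  local servers
  # Collect the configured nameserver addresses, one per line.
  servers="$(awk '$1 == "nameserver" { print $2 }' "$default_path" 2>/dev/null)"
  # Assumption: a lone 127.0.0.53 entry is the systemd-resolved stub, which is
  # a loopback address and therefore useless inside a container or guest.
  if [ "$servers" = "127.0.0.53" ] && [ -r "$upstream_path" ]; then
    echo "$upstream_path"
  else
    echo "$default_path"
  fi
}
```

On a host like the one shown earlier in this thread, `pick_resolv_conf` with no arguments would be expected to print /run/systemd/resolve/resolv.conf, since /etc/resolv.conf lists only the stub resolver.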
Hello,
Can you check that the wrapper containers have the right /etc/resolv.conf? That will help us know if the problem is the ignite CLI or the in-container DHCPv4+guest-config.
Do I ssh into the VM for this?
@rugwirobaker You can't escape the VM into the wrapper container.
Depending on your runtime, you can use the ctr or docker command-line tools from the host to exec into it:
# containerd example:
sudo ctr -n firecracker task exec --exec-id="${RANDOM}" \
--tty ignite-6471867f590ea14b cat /etc/resolv.conf
# 6471867f590ea14b is the ignite VM ID
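Once the container's file has been captured, it can be compared with the host's copy. The helpers below are a hypothetical sketch (the names `normalize_resolv_conf` and `same_resolv_conf` are not part of Ignite). They ignore comments, blank lines, and the carriage returns that `--tty` tends to add to captured output.

```shell
# Hypothetical helpers, not part of Ignite.

# Strip carriage returns, comments, and blank lines so only directives remain.
normalize_resolv_conf() {
  tr -d '\r' | sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d'
}

# Succeed (exit 0) when two resolv.conf files carry the same directives.
same_resolv_conf() {
  [ "$(normalize_resolv_conf < "$1")" = "$(normalize_resolv_conf < "$2")" ]
}
```

For example, redirect the output of the ctr command above into /tmp/wrapper-resolv.conf, then run `same_resolv_conf /run/systemd/resolve/resolv.conf /tmp/wrapper-resolv.conf && echo match`. With the Docker runtime, `sudo docker exec ignite-<VM ID> cat /etc/resolv.conf` should produce the same capture, assuming the container follows the same ignite-<VM ID> naming as in the containerd example.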
This could be related to and potentially fixed by #581
I'm going to see if it works this weekend.
Excellent -- thanks and good luck! Let us know if you get it working.