Kilo creates the bridge interface only on one of the k8s nodes
A 2-node k8s cluster created via kubeadm. The nodes are placed in different availability zones and have only dedicated external IP addresses (no private networks attached, etc.).
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:09:38Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
# docker version
Client: Docker Engine - Community
Version: 19.03.15
API version: 1.40
Go version: go1.13.15
Git commit: 99e3ed8919
Built: Sat Jan 30 03:16:51 2021
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.15
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 99e3ed8919
Built: Sat Jan 30 03:15:20 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.18.0
GitCommit: fec3683
kubeadm init command (other params are default):
kubeadm init --pod-network-cidr "10.10.0.0/16"
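For reference, a quick way to confirm the per-node pod CIDRs that kubeadm carved out of this range (Kilo later derives each node's CNI subnet from .spec.podCIDR) might be this sketch:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'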
Initial k8s cluster status (no CNI):
# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
test-kilo-0 NotReady control-plane,master 2m16s v1.20.4 194.182.164.214 <none> Ubuntu 18.04.5 LTS 4.15.0-136-generic docker://19.3.15
test-kilo-1 NotReady worker 49s v1.20.4 185.19.28.241 <none> Ubuntu 18.04.5 LTS 4.15.0-136-generic docker://19.3.15
# kubectl get po -o wide -A
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-74ff55c5b-6gvdk 0/1 Pending 0 4m5s <none> <none> <none> <none>
kube-system coredns-74ff55c5b-hfd5l 0/1 Pending 0 4m5s <none> <none> <none> <none>
kube-system etcd-test-kilo-0 1/1 Running 0 4m19s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-apiserver-test-kilo-0 1/1 Running 1 4m19s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-controller-manager-test-kilo-0 1/1 Running 0 4m19s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-proxy-nhrzd 1/1 Running 0 2m56s 185.19.28.241 test-kilo-1 <none> <none>
kube-system kube-proxy-pnsb8 1/1 Running 0 4m5s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-scheduler-test-kilo-0 1/1 Running 0 4m19s 194.182.164.214 test-kilo-0 <none> <none>
Using kilo-kubeadm.yaml to install Kilo.
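The manifest was applied with something like the following (a sketch; the exact URL may differ for your Kilo version):
kubectl apply -f https://raw.githubusercontent.com/squat/kilo/main/manifests/kilo-kubeadm.yaml
Here is the result: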
# kubectl get po -o wide -A
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-74ff55c5b-6gvdk 0/1 Running 0 6m25s 10.10.1.2 test-kilo-1 <none> <none>
kube-system coredns-74ff55c5b-hfd5l 0/1 Running 0 6m25s 10.10.1.3 test-kilo-1 <none> <none>
kube-system etcd-test-kilo-0 1/1 Running 0 6m39s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kilo-cthl8 1/1 Running 0 72s 185.19.28.241 test-kilo-1 <none> <none>
kube-system kilo-rk4vf 1/1 Running 0 72s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-apiserver-test-kilo-0 1/1 Running 1 6m39s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-controller-manager-test-kilo-0 1/1 Running 0 6m39s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-proxy-nhrzd 1/1 Running 0 5m16s 185.19.28.241 test-kilo-1 <none> <none>
kube-system kube-proxy-pnsb8 1/1 Running 0 6m25s 194.182.164.214 test-kilo-0 <none> <none>
kube-system kube-scheduler-test-kilo-0 1/1 Running 0 6m39s 194.182.164.214 test-kilo-0 <none> <none>
It looks like the CNI configuration is stuck at some step. If we check the Kilo logs, we can see that the config on node1 is incomplete.
# kubectl -n kube-system logs kilo-cthl8
{"caller":"mesh.go:96","component":"kilo","level":"warn","msg":"no private key found on disk; generating one now","ts":"2021-03-03T10:18:49.037598301Z"}
{"caller":"main.go:221","msg":"Starting Kilo network mesh '2b959f7020a8dbb6b32860965ed4dbfd0dd11215'.","ts":"2021-03-03T10:18:49.078808548Z"}
{"caller":"cni.go:60","component":"kilo","err":"failed to read IPAM config from CNI config list file: no IP ranges specified","level":"warn","msg":"failed to get CIDR from CNI file; overwriting it","ts":"2021-03-03T10:18:49.180229464Z"}
{"caller":"cni.go:68","component":"kilo","level":"info","msg":"CIDR in CNI file is empty","ts":"2021-03-03T10:18:49.180312275Z"}
{"CIDR":"10.10.1.0/24","caller":"cni.go:73","component":"kilo","level":"info","msg":"setting CIDR in CNI file","ts":"2021-03-03T10:18:49.180347087Z"}
{"caller":"mesh.go:532","component":"kilo","level":"info","msg":"WireGuard configurations are different","ts":"2021-03-03T10:18:49.559865866Z"}
{"caller":"mesh.go:301","component":"kilo","event":"add","level":"info","node":{"Endpoint":{"DNS":"","IP":"194.182.164.214","Port":51820},"Key":"OURzdHQwdFJBWU5PRzUxTVFTZE9ZaVFWUnp1NHNxS3ZKdEdvZGtGK2huaz0=","InternalIP":null,"LastSeen":1614766724,"Leader":false,"Location":"","Name":"test-kilo-0","PersistentKeepalive":0,"Subnet":{"IP":"10.10.0.0","Mask":"////AA=="},"WireGuardIP":{"IP":"10.4.0.1","Mask":"//8AAA=="}},"ts":"2021-03-03T10:18:49.587900799Z"}
{"caller":"mesh.go:532","component":"kilo","level":"info","msg":"WireGuard configurations are different","ts":"2021-03-03T10:18:49.727266934Z"}
# kubectl -n kube-system logs kilo-rk4vf
{"caller":"mesh.go:96","component":"kilo","level":"warn","msg":"no private key found on disk; generating one now","ts":"2021-03-03T10:18:43.12428951Z"}
{"caller":"main.go:221","msg":"Starting Kilo network mesh '2b959f7020a8dbb6b32860965ed4dbfd0dd11215'.","ts":"2021-03-03T10:18:43.149461372Z"}
{"caller":"cni.go:60","component":"kilo","err":"failed to read IPAM config from CNI config list file: no IP ranges specified","level":"warn","msg":"failed to get CIDR from CNI file; overwriting it","ts":"2021-03-03T10:18:43.250712032Z"}
{"caller":"cni.go:68","component":"kilo","level":"info","msg":"CIDR in CNI file is empty","ts":"2021-03-03T10:18:43.251153965Z"}
{"CIDR":"10.10.0.0/24","caller":"cni.go:73","component":"kilo","level":"info","msg":"setting CIDR in CNI file","ts":"2021-03-03T10:18:43.251455081Z"}
E0303 10:18:43.282674 1 reflector.go:126] pkg/k8s/backend.go:407: Failed to list *v1alpha1.Peer: the server could not find the requested resource (get peers.kilo.squat.ai)
{"caller":"mesh.go:532","component":"kilo","level":"info","msg":"WireGuard configurations are different","ts":"2021-03-03T10:18:44.649959279Z"}
{"caller":"mesh.go:301","component":"kilo","event":"update","level":"info","node":{"Endpoint":{"DNS":"","IP":"185.19.28.241","Port":51820},"Key":"TjJsb0Z1eG51N2dVM21yS2VHRVAyV0thN2MxdkFIU0piVWZwZ2ZZT09qOD0=","InternalIP":null,"LastSeen":1614766729,"Leader":false,"Location":"","Name":"test-kilo-1","PersistentKeepalive":0,"Subnet":{"IP":"10.10.1.0","Mask":"////AA=="},"WireGuardIP":null},"ts":"2021-03-03T10:18:49.31665582Z"}
{"caller":"mesh.go:532","component":"kilo","level":"info","msg":"WireGuard configurations are different","ts":"2021-03-03T10:18:49.437484515Z"}
{"caller":"mesh.go:301","component":"kilo","event":"update","level":"info","node":{"Endpoint":{"DNS":"","IP":"185.19.28.241","Port":51820},"Key":"TjJsb0Z1eG51N2dVM21yS2VHRVAyV0thN2MxdkFIU0piVWZwZ2ZZT09qOD0=","InternalIP":null,"LastSeen":1614766729,"Leader":false,"Location":"","Name":"test-kilo-1","PersistentKeepalive":0,"Subnet":{"IP":"10.10.1.0","Mask":"////AA=="},"WireGuardIP":{"IP":"10.4.0.2","Mask":"//8AAA=="}},"ts":"2021-03-03T10:18:49.862060523Z"}
WireGuard looks OK on both nodes:
node1:# wg
interface: kilo0
public key: 9Dstt0tRAYNOG51MQSdOYiQVRzu4sqKvJtGodkF+hnk=
private key: (hidden)
listening port: 51820
peer: N2loFuxnu7gU3mrKeGEP2WKa7c1vAHSJbUfpgfYOOj8=
endpoint: 185.19.28.241:51820
allowed ips: 10.10.1.0/24, 10.4.0.2/32
node2:# wg
interface: kilo0
public key: N2loFuxnu7gU3mrKeGEP2WKa7c1vAHSJbUfpgfYOOj8=
private key: (hidden)
listening port: 51820
peer: 9Dstt0tRAYNOG51MQSdOYiQVRzu4sqKvJtGodkF+hnk=
endpoint: 194.182.164.214:51820
allowed ips: 10.10.0.0/24, 10.4.0.1/32
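At this point, a quick sanity check of the tunnel itself might be (a sketch, run from node1; 10.4.0.2 is node2's kilo0 address from the output above):
wg show kilo0 latest-handshakes
ping -c 3 10.4.0.2
The first confirms the peers recently completed a handshake; the second confirms traffic actually flows over the WireGuard IPs.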
Let's check the network interfaces, and here is the problem: the WireGuard tunnel is OK, but the bridge interface is not created on node1.
node1: # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 06:53:98:00:0d:ba brd ff:ff:ff:ff:ff:ff
inet 194.182.164.214/22 brd 194.182.167.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::453:98ff:fe00:dba/64 scope link
valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:5c:f9:e9:5b brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
4: kilo0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
link/none
inet 10.4.0.1/16 brd 10.4.255.255 scope global kilo0
valid_lft forever preferred_lft forever
node2:# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 06:fa:0a:00:01:2e brd ff:ff:ff:ff:ff:ff
inet 185.19.28.241/22 brd 185.19.31.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::4fa:aff:fe00:12e/64 scope link
valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:b9:77:a3:dd brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
4: kilo0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
link/none
inet 10.4.0.2/16 brd 10.4.255.255 scope global kilo0
valid_lft forever preferred_lft forever
5: kube-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1420 qdisc noqueue state UP group default qlen 1000
link/ether 52:1b:ad:e0:59:3b brd ff:ff:ff:ff:ff:ff
inet 10.10.1.1/24 scope global kube-bridge
valid_lft forever preferred_lft forever
inet6 fe80::501b:adff:fee0:593b/64 scope link
valid_lft forever preferred_lft forever
6: vethd21e7af3@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1420 qdisc noqueue master kube-bridge state UP group default
link/ether 8a:2a:7e:4e:9b:4e brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::882a:7eff:fe4e:9b4e/64 scope link
valid_lft forever preferred_lft forever
7: vethddbcea8c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1420 qdisc noqueue master kube-bridge state UP group default
link/ether 26:5c:ee:4c:40:b7 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::245c:eeff:fe4c:40b7/64 scope link
valid_lft forever preferred_lft forever
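One detail worth keeping in mind when comparing the two nodes: the bridge CNI plugin only creates kube-bridge the first time a pod is attached to the pod network on that node, so a node running only hostNetwork pods won't show it. A hedged way to force the plugin to run on node1 (the pod name and image are just for illustration):
kubectl run cni-test --image=busybox --restart=Never --overrides='{"spec":{"nodeName":"test-kilo-0"}}' -- sleep 3600
Setting nodeName directly bypasses the scheduler, so this works even while the node is NotReady.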
node1:# cat /etc/cni/net.d/10-kilo.conflist
{"cniVersion":"0.3.1","name":"kilo","plugins":[{"bridge":"kube-bridge","forceAddress":true,"ipam":{"ranges":[[{"subnet":"10.10.0.0/24"}]],"type":"host-local"},"isDefaultGateway":true,"mtu":1420,"name":"kubernetes","type":"bridge"},{"capabilities":{"portMappings":true},"snat":true,"type":"portmap"}]}
node2:# cat /etc/cni/net.d/10-kilo.conflist
{"cniVersion":"0.3.1","name":"kilo","plugins":[{"bridge":"kube-bridge","forceAddress":true,"ipam":{"ranges":[[{"subnet":"10.10.1.0/24"}]],"type":"host-local"},"isDefaultGateway":true,"mtu":1420,"name":"kubernetes","type":"bridge"},{"capabilities":{"portMappings":true},"snat":true,"type":"portmap"}]}
Ack, thanks a lot for reporting this. You provided tons of helpful details. Could you share some additional pieces of info:
- What tag of the squat/kilo image are you using?
- The kube-bridge interface is created by the kubelet, which reads the CNI config written by Kilo; can you share the kubelet logs for node-1? (A sketch for grabbing them follows this list.)
- Is this fixed by restarting node-1 or the kubelet process? If so, this really indicates some kubelet bug to me.
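For the kubelet logs, something like this should work on a systemd host (a sketch):
journalctl -u kubelet --no-pager | grep -i cni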
- I guess it would be better if I paste the whole pod description here:
# kubectl -n kube-system describe po kilo-8smsx
Name: kilo-8smsx
Namespace: kube-system
Priority: 0
Node: test-kilo-0/159.100.245.14
Start Time: Wed, 03 Mar 2021 10:54:56 +0000
Labels: app.kubernetes.io/name=kilo
controller-revision-hash=597846cbb6
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 159.100.245.14
IPs:
IP: 159.100.245.14
Controlled By: DaemonSet/kilo
Init Containers:
install-cni:
Container ID: docker://fd64ec764c26f171813f1a675f5320d06f3f8511bcd8a896e9369dc8a719bfe2
Image: squat/kilo
Image ID: docker-pullable://squat/kilo@sha256:05dcc0b50e597345a9b8afc2bb5c5eb633c205e5bd178bd257383f885cdf5ba2
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
set -e -x; cp /opt/cni/bin/* /host/opt/cni/bin/; TMP_CONF="$CNI_CONF_NAME".tmp; echo "$CNI_NETWORK_CONFIG" > $TMP_CONF; rm -f /host/etc/cni/net.d/*; mv $TMP_CONF /host/etc/cni/net.d/$CNI_CONF_NAME
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 03 Mar 2021 10:55:02 +0000
Finished: Wed, 03 Mar 2021 10:55:02 +0000
Ready: True
Restart Count: 0
Environment:
CNI_CONF_NAME: 10-kilo.conflist
CNI_NETWORK_CONFIG: <set to the key 'cni-conf.json' of config map 'kilo'> Optional: false
Mounts:
/host/etc/cni/net.d from cni-conf-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kilo-token-4qmzq (ro)
Containers:
kilo:
Container ID: docker://81b344c3451e93473974bb61fac34a9d81c57b09e4aa26045e5aa458d99408aa
Image: squat/kilo
Image ID: docker-pullable://squat/kilo@sha256:05dcc0b50e597345a9b8afc2bb5c5eb633c205e5bd178bd257383f885cdf5ba2
Port: <none>
Host Port: <none>
Args:
--kubeconfig=/etc/kubernetes/kubeconfig
--hostname=$(NODE_NAME)
State: Running
Started: Wed, 03 Mar 2021 10:55:04 +0000
Ready: True
Restart Count: 0
Environment:
NODE_NAME: (v1:spec.nodeName)
Mounts:
/etc/cni/net.d from cni-conf-dir (rw)
/etc/kubernetes from kubeconfig (ro)
/lib/modules from lib-modules (ro)
/run/xtables.lock from xtables-lock (rw)
/var/lib/kilo from kilo-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kilo-token-4qmzq (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
cni-bin-dir:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni-conf-dir:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
kilo-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kilo
HostPathType:
kubeconfig:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-proxy
Optional: false
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
kilo-token-4qmzq:
Type: Secret (a volume populated by a Secret)
SecretName: kilo-token-4qmzq
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m40s default-scheduler Successfully assigned kube-system/kilo-8smsx to test-kilo-0
Warning FailedMount 4m40s kubelet MountVolume.SetUp failed for volume "kilo-token-4qmzq" : failed to sync secret cache: timed out waiting for the condition
Normal Pulling 4m39s kubelet Pulling image "squat/kilo"
Normal Pulled 4m35s kubelet Successfully pulled image "squat/kilo" in 3.349492743s
Normal Created 4m35s kubelet Created container install-cni
Normal Started 4m35s kubelet Started container install-cni
Normal Pulling 4m35s kubelet Pulling image "squat/kilo"
Normal Pulled 4m33s kubelet Successfully pulled image "squat/kilo" in 1.646763634s
Normal Created 4m33s kubelet Created container kilo
Normal Started 4m33s kubelet Started container kilo
- There are tons of logs; here are the most useful ones I found:
Mar 03 10:54:50 test-kilo-0 kubelet[4618]: W0303 10:54:50.652407 4618 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 03 10:54:52 test-kilo-0 kubelet[4618]: E0303 10:54:52.043437 4618 kubelet.go:2184] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Mar 03 10:54:55 test-kilo-0 kubelet[4618]: W0303 10:54:55.652779 4618 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.523457 4618 topology_manager.go:187] [topologymanager] Topology Admit Handler
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: E0303 10:54:56.533865 4618 reflector.go:138] object-"kube-system"/"kilo": Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "kilo" is forbidden: User "system:node:test-kilo-0" cannot list resource "configmaps" in API group "" in the namespace "kube-system": no relationship found between node 'test-kilo-0' and this object
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: E0303 10:54:56.533931 4618 reflector.go:138] object-"kube-system"/"kilo-token-4qmzq": Failed to watch *v1.Secret: failed to list *v1.Secret: secrets "kilo-token-4qmzq" is forbidden: User "system:node:test-kilo-0" cannot list resource "secrets" in API group "" in the namespace "kube-system": no relationship found between node 'test-kilo-0' and this object
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644244 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/configmap/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-kubeconfig") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644318 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-lib-modules") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644341 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-xtables-lock") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644420 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "kilo-dir" (UniqueName: "kubernetes.io/host-path/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-kilo-dir") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644458 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "kilo-token-4qmzq" (UniqueName: "kubernetes.io/secret/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-kilo-token-4qmzq") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644480 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-bin-dir" (UniqueName: "kubernetes.io/host-path/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-cni-bin-dir") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:56 test-kilo-0 kubelet[4618]: I0303 10:54:56.644506 4618 reconciler.go:224] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-conf-dir" (UniqueName: "kubernetes.io/host-path/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-cni-conf-dir") pod "kilo-8smsx" (UID: "6f2aa1fb-6811-4dbf-849b-0cbfb740d25f")
Mar 03 10:54:57 test-kilo-0 kubelet[4618]: E0303 10:54:57.056909 4618 kubelet.go:2184] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Mar 03 10:54:57 test-kilo-0 kubelet[4618]: E0303 10:54:57.745928 4618 secret.go:195] Couldn't get secret kube-system/kilo-token-4qmzq: failed to sync secret cache: timed out waiting for the condition
Mar 03 10:54:57 test-kilo-0 kubelet[4618]: E0303 10:54:57.746888 4618 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/secret/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-kilo-token-4qmzq podName:6f2aa1fb-6811-4dbf-849b-0cbfb740d25f nodeName:}" failed. No retries permitted until 2021-03-03 10:54:58.246746213 +0000 UTC m=+222.879815780 (durationBeforeRetry 500ms). Error: "MountVolume.SetUp failed for volume \"kilo-token-4qmzq\" (UniqueName: \"kubernetes.io/secret/6f2aa1fb-6811-4dbf-849b-0cbfb740d25f-kilo-token-4qmzq\") pod \"kilo-8smsx\" (UID: \"6f2aa1fb-6811-4dbf-849b-0cbfb740d25f\") : failed to sync secret cache: timed out waiting for the condition"
Mar 03 10:55:00 test-kilo-0 kubelet[4618]: W0303 10:55:00.653935 4618 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 03 10:55:02 test-kilo-0 kubelet[4618]: E0303 10:55:02.092521 4618 kubelet.go:2184] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
- Rebooting or restarting any service doesn't help. The cluster remains in the same state.
@3rmack thanks for the quick reply. It's comforting that restarting doesn't resolve the issue, otherwise we might not have a convincing solution.
The kubelet seems to complain that it can't find any configuration in the CNI directory. Indeed, it seems that the Kilo manifests for kubeadm install the CNI configuration in the wrong directory: they are using /etc/kubernetes/cni/net.d [0] when they should be using /etc/cni/net.d [1].
Can you try redeploying the Kilo DaemonSet with the corrected host path?
If this fixes the issue, then please submit a PR if you can :)
[0] https://github.com/squat/kilo/blob/main/manifests/kilo-kubeadm.yaml#L163 [1] https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#cni
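A quick way to redeploy with the corrected host path might be (a sketch; adjust the manifest URL to the version you deployed):
curl -sL https://raw.githubusercontent.com/squat/kilo/main/manifests/kilo-kubeadm.yaml | sed 's|/etc/kubernetes/cni/net.d|/etc/cni/net.d|g' | kubectl apply -f -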
Actually, it is already deployed with the correct path /etc/cni/net.d. That was found and corrected at the very beginning of testing Kilo.
Hmm, in that case we'll need a bit more inspection. Can you please share the output of cat /etc/cni/net.d/10-kilo.conflist from the broken node? I need to double-check that nothing changed in v1.20 that prevents CNI v0.3.1 from being read.
You wrote:
If we check the Kilo logs, we can see that the config on node1 is incomplete.
What do you mean? I didn't see any logs from Kilo that imply that.
I took a look, and CNI v0.3.1 should still work. We need to ensure that the CNI configuration file exists on the broken node and check whether there are recent kubelet logs that continue to complain about "no networks found".
What do you mean? I didn't see any logs from Kilo that imply that.
I mean that if we compare the logs from both Kilo pods, the logs from the pod on the "failed" node are shorter. There is an extra "update" event in the logs on the "ok" node.
We need to ensure that the CNI configuration file exists on the broken node and check whether there are recent kubelet logs that continue to complain about "no networks found".
The config files are present on both nodes. Please check my initial post; the contents of those files are at the very end of it.
The kubelet logs are complaining about "no networks found" even though the CNI config exists in the /etc/cni/net.d directory.
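One more thing worth double-checking at this point is which directory the kubelet is actually scanning; --cni-conf-dir defaults to /etc/cni/net.d but can be overridden (a sketch):
ps -ef | grep [k]ubelet
cat /var/lib/kubelet/kubeadm-flags.env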
Ack, yes, I somehow missed them earlier. Thanks. I spun up a two-node kubeadm cluster last night and haven't been able to replicate this issue yet.
Also worth mentioning that it could be a hardware/OS/etc. issue on the cloud provider where I tested this. The Ubuntu servers were created from the cloud provider's templates.
I haven't been able to replicate this :/ This seems like it may be an issue with the specific environment, perhaps a container runtime issue.
I get the same error. Can I get any help?
root@RJYF-P-337:/etc/cni/net.d# cat 10-kilo.conflist |jq
{
"cniVersion": "0.3.1",
"name": "kilo",
"plugins": [
{
"bridge": "kube-bridge",
"forceAddress": true,
"ipam": {
"ranges": [
[
{
"subnet": "10.1.0.0/24"
}
]
],
"type": "host-local"
},
"isDefaultGateway": true,
"mtu": 1420,
"name": "kubernetes",
"type": "bridge"
},
{
"capabilities": {
"portMappings": true
},
"snat": true,
"type": "portmap"
}
]
}
root@RJYF-P-337:/etc/cni/net.d# kubectl logs -f -n kube-system kilo-lcjvb kilo
{"caller":"main.go:277","msg":"Starting Kilo network mesh 'a1af9790ea541c683d528d5a1d23075528d682d4'.","ts":"2022-03-25T06:58:31.331505641Z"}
{"caller":"cni.go:61","component":"kilo","err":"failed to read IPAM config from CNI config list file: no IP ranges specified","level":"warn","msg":"failed to get CIDR from CNI file; overwriting it","ts":"2022-03-25T06:58:31.432995767Z"}
{"caller":"cni.go:69","component":"kilo","level":"info","msg":"CIDR in CNI file is empty","ts":"2022-03-25T06:58:31.433046208Z"}
{"CIDR":"10.1.0.0/24","caller":"cni.go:74","component":"kilo","level":"info","msg":"setting CIDR in CNI file","ts":"2022-03-25T06:58:31.43305818Z"}
{"caller":"mesh.go:375","component":"kilo","level":"info","msg":"overriding endpoint","new endpoint":"172.20.60.28:51820","node":"rjyf-p-337","old endpoint":"","ts":"2022-03-25T06:58:31.541709926Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T06:58:31.555442689Z"}
{"caller":"mesh.go:309","component":"kilo","event":"add","level":"info","node":{"Endpoint":{},"Key":[27,123,34,254,51,164,151,222,139,112,14,118,233,72,232,252,215,192,141,112,145,225,11,124,100,1,92,187,19,84,89,108],"NoInternalIP":false,"InternalIP":{"IP":"10.2.0.1","Mask":"/////w=="},"LastSeen":1648191504,"Leader":false,"Location":"","Name":"lc","PersistentKeepalive":0,"Subnet":{"IP":"10.1.3.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":"full"},"ts":"2022-03-25T06:58:31.555600099Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T06:58:31.556817226Z"}
{"caller":"mesh.go:309","component":"kilo","event":"add","level":"info","node":{"Endpoint":null,"Key":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"gcp","Name":"rjyf-p-335","PersistentKeepalive":0,"Subnet":{"IP":"10.1.1.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2022-03-25T06:58:31.556912803Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T06:58:31.557776266Z"}
{"caller":"mesh.go:309","component":"kilo","event":"add","level":"info","node":{"Endpoint":{},"Key":[199,66,125,140,234,59,65,207,73,92,126,95,247,144,33,194,75,219,98,104,213,187,67,24,129,193,0,124,228,8,160,31],"NoInternalIP":false,"InternalIP":{"IP":"172.20.60.31","Mask":"///8AA=="},"LastSeen":1648191502,"Leader":false,"Location":"gcp","Name":"rjyf-p-336","PersistentKeepalive":0,"Subnet":{"IP":"10.1.2.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":"full"},"ts":"2022-03-25T06:58:31.557862063Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T06:58:31.55877808Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T06:59:01.543738566Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T06:59:31.545704256Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:00:01.547772771Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:00:31.550088195Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:01:01.551853854Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:01:31.554154067Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:02:01.556278704Z"}
{"caller":"mesh.go:309","component":"kilo","event":"update","level":"info","node":{"Endpoint":null,"Key":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"gcp","Name":"rjyf-p-335","PersistentKeepalive":0,"Subnet":{"IP":"10.1.1.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2022-03-25T07:02:14.831024851Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:02:14.832378096Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:02:31.558733749Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:03:01.560622049Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:03:31.563116772Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:04:01.565075605Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:04:31.568063262Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:05:01.57004051Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:05:31.571529246Z"}
{"caller":"mesh.go:482","component":"kilo","error":"file does not exist","level":"error","ts":"2022-03-25T07:06:01.573270241Z"}
^C
Here is my YAML:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kilo
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kilo
    app.kubernetes.io/part-of: kilo
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kilo
      app.kubernetes.io/part-of: kilo
  template:
    metadata:
      labels:
        app.kubernetes.io/name: kilo
        app.kubernetes.io/part-of: kilo
    spec:
      serviceAccountName: kilo
      hostNetwork: true
      containers:
      - name: boringtun
        image: leonnicolas/boringtun
        args:
        - --disable-drop-privileges=true
        - --foreground
        - kilo0
        securityContext:
          privileged: true
        volumeMounts:
        - name: wireguard
          mountPath: /var/run/wireguard
          readOnly: false
      - name: kilo
        image: squat/kilo
        args:
        - --kubeconfig=/etc/kubernetes/kubeconfig
        - --hostname=$(NODE_NAME)
        - --create-interface=false
        - --interface=kilo0
        - --mesh-granularity=full
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        ports:
        - containerPort: 1107
          name: metrics
        securityContext:
          privileged: true
        volumeMounts:
        - name: cni-conf-dir
          mountPath: /etc/cni/net.d
        - name: kilo-dir
          mountPath: /var/lib/kilo
        - name: kubeconfig
          mountPath: /etc/kubernetes
          readOnly: true
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
        - name: xtables-lock
          mountPath: /run/xtables.lock
          readOnly: false
      initContainers:
      - name: install-cni
        image: squat/kilo
        command:
        - /bin/sh
        - -c
        - set -e -x;
          cp /opt/cni/bin/* /host/opt/cni/bin/;
          TMP_CONF="$CNI_CONF_NAME".tmp;
          echo "$CNI_NETWORK_CONFIG" > $TMP_CONF;
          rm -f /host/etc/cni/net.d/*;
          mv $TMP_CONF /host/etc/cni/net.d/$CNI_CONF_NAME
        env:
        - name: CNI_CONF_NAME
          value: 10-kilo.conflist
        - name: CNI_NETWORK_CONFIG
          valueFrom:
            configMapKeyRef:
              name: kilo
              key: cni-conf.json
        volumeMounts:
        - name: cni-bin-dir
          mountPath: /host/opt/cni/bin
        - name: cni-conf-dir
          mountPath: /host/etc/cni/net.d
      tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
      volumes:
      - name: cni-bin-dir
        hostPath:
          path: /opt/cni/bin
      - name: cni-conf-dir
        hostPath:
          path: /etc/cni/net.d
      - name: kilo-dir
        hostPath:
          path: /var/lib/kilo
      - name: kubeconfig
        configMap:
          name: kube-proxy
          items:
          - key: kubeconfig.conf
            path: kubeconfig
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: wireguard
        hostPath:
          path: /var/run/wireguard
Kubernetes version: v1.20.9, containerd version: v1.5.2, OS: Ubuntu 18.04.
@hhstu this seems like the kilo0 device is not available / doesn't exist. Are there any logs from the boringtun container?
Just this, @squat:
root@RJYF-P-337:/etc/cni/net.d# kubectl logs -f -n kube-system kilo-tgmnz boringtun
2022-03-25T07:23:39.490195Z INFO boringtun_cli: BoringTun started successfully
at boringtun-cli/src/main.rs:178
Hmmm, can you please show a list of the devices available in the erroring Kilo pod? ip l
@squat
root@RJYF-P-337:/etc/cni/net.d# kubectl exec -it -n kube-system kilo-tgmnz -c kilo ip a
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:95:d4:7c brd ff:ff:ff:ff:ff:ff
inet 172.20.60.28/22 brd 172.20.63.255 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe95:d47c/64 scope link
valid_lft forever preferred_lft forever
3: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN
link/ether ba:35:6f:9a:01:f1 brd ff:ff:ff:ff:ff:ff
inet 10.2.0.10/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
inet 10.2.0.1/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
4: kube-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1420 qdisc noqueue state UP qlen 1000
link/ether 82:7f:e9:a4:2f:9f brd ff:ff:ff:ff:ff:ff
inet 10.1.0.1/24 brd 10.1.0.255 scope global kube-bridge
valid_lft forever preferred_lft forever
inet6 fe80::a0ae:11ff:fee8:a477/64 scope link
valid_lft forever preferred_lft forever
17: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
31: kilo0: <POINTOPOINT,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN qlen 500
link/[65534]
32: vethd3527ba5@kube-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1420 qdisc noqueue master kube-bridge state UP
link/ether 82:7f:e9:a4:2f:9f brd ff:ff:ff:ff:ff:ff
inet6 fe80::807f:e9ff:fea4:2f9f/64 scope link
valid_lft forever preferred_lft forever
root@RJYF-P-337:/etc/cni/net.d#
Thanks @hhstu, so there is indeed a kilo0 interface available. Some things that come to mind:
What differences are there between this node and the one that is working? Different OS? OS version? Hardware? One uses boringtun and the other doesn't? Etc. Knowing the differences may help determine why this works on one machine but not the other.
This is a new kubeadm cluster. None of the Kilo pods work! I have never gotten it working, @squat.
Here is my kubeadm-config:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.20.60.28
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.20.9
controlPlaneEndpoint: apiserver.cluster.local:6443
imageRepository: 172.20.60.28:5000/ccs
networking:
  dnsDomain: cluster.local
  podSubnet: 10.1.0.0/16
  serviceSubnet: 10.2.0.0/16
apiServer:
  certSANs:
  - 127.0.0.1
  - apiserver.cluster.local
  - 172.20.60.28
  - 172.20.60.31
  - 172.20.60.32
  - 10.103.97.2
  extraArgs:
    feature-gates: TTLAfterFinished=true,RemoveSelfLink=false
    max-mutating-requests-inflight: "4000"
    max-requests-inflight: "8000"
    default-unreachable-toleration-seconds: "2"
  extraVolumes:
  - name: localtime
    hostPath: /etc/localtime
    mountPath: /etc/localtime
    readOnly: true
    pathType: File
controllerManager:
  extraArgs:
    bind-address: 0.0.0.0
    secure-port: "10257"
    port: "10252"
    kube-api-burst: "100"
    kube-api-qps: "50"
    feature-gates: TTLAfterFinished=true,RemoveSelfLink=false
    experimental-cluster-signing-duration: 876000h
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
    pathType: File
scheduler:
  extraArgs:
    bind-address: 0.0.0.0
    kube-api-burst: "100"
    kube-api-qps: "50"
    port: "10251"
    secure-port: "10259"
    feature-gates: TTLAfterFinished=true,RemoveSelfLink=false
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
    pathType: File
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
metricsBindAddress: 0.0.0.0
bindAddress: 0.0.0.0
ipvs:
  syncPeriod: 30s
  minSyncPeriod: 5s
  scheduler: rr
  excludeCIDRs:
  - 10.103.97.2/32
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
kubeAPIQPS: 40
kubeAPIBurst: 50
imageMinimumGCAge: 48h
imageGCHighThresholdPercent: 85
evictionHard:
  imagefs.available: 5%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
Also, @hhstu, does this work if you pin Kilo to 0.3.1?
I wonder if this might be due to the switch to using a different WireGuard client library.
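Pinning can be done in place, e.g. (a sketch; this only swaps the main kilo container's image):
kubectl -n kube-system set image daemonset/kilo kilo=squat/kilo:0.3.1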
After changing to 0.3.1:
root@RJYF-P-337:~# kubectl logs -f -n kube-system kilo-8zvcp kilo
{"caller":"main.go:221","msg":"Starting Kilo network mesh '0.3.1'.","ts":"2022-03-25T08:18:42.676269936Z"}
{"caller":"cni.go:60","component":"kilo","err":"failed to read IPAM config from CNI config list file: no IP ranges specified","level":"warn","msg":"failed to get CIDR from CNI file; overwriting it","ts":"2022-03-25T08:18:42.777749293Z"}
{"caller":"cni.go:68","component":"kilo","level":"info","msg":"CIDR in CNI file is empty","ts":"2022-03-25T08:18:42.777855805Z"}
{"CIDR":"10.1.2.0/24","caller":"cni.go:73","component":"kilo","level":"info","msg":"setting CIDR in CNI file","ts":"2022-03-25T08:18:42.777881579Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:18:42.903321326Z"}
{"caller":"mesh.go:297","component":"kilo","event":"add","level":"info","node":{"Endpoint":null,"Key":"","NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"gcp","Name":"rjyf-p-337","PersistentKeepalive":0,"Subnet":{"IP":"10.1.0.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2022-03-25T08:18:42.903400228Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:18:42.904926106Z"}
{"caller":"mesh.go:297","component":"kilo","event":"add","level":"info","node":{"Endpoint":null,"Key":"","NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"","Name":"lc","PersistentKeepalive":0,"Subnet":{"IP":"10.1.3.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2022-03-25T08:18:42.904978993Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:18:42.907857284Z"}
{"caller":"mesh.go:297","component":"kilo","event":"add","level":"info","node":{"Endpoint":null,"Key":"","NoInternalIP":false,"InternalIP":null,"LastSeen":0,"Leader":false,"Location":"gcp","Name":"rjyf-p-335","PersistentKeepalive":0,"Subnet":{"IP":"10.1.1.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":""},"ts":"2022-03-25T08:18:42.908017109Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:18:42.909536967Z"}
{"caller":"mesh.go:297","component":"kilo","event":"update","level":"info","node":{"Endpoint":{"DNS":"","IP":"172.20.60.32","Port":51820},"Key":"dHRvZ2VyaEZtMk5sczBtUTN2M0x6bFBLWWZ4R2dDQ0JobEtHZEZKVGFtaz0=","NoInternalIP":false,"InternalIP":{"IP":"172.20.60.32","Mask":"///8AA=="},"LastSeen":1648196324,"Leader":false,"Location":"gcp","Name":"rjyf-p-335","PersistentKeepalive":0,"Subnet":{"IP":"10.1.1.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":"full"},"ts":"2022-03-25T08:18:44.152516296Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:18:44.154150488Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:19:12.888451019Z"}
{"caller":"mesh.go:297","component":"kilo","event":"update","level":"info","node":{"Endpoint":{"DNS":"","IP":"172.10.97.10","Port":51820},"Key":"RzNzaS9qT2tsOTZMY0E1MjZVam8vTmZBalhDUjRRdDhaQUZjdXhOVVdXdz0=","NoInternalIP":false,"InternalIP":{"IP":"10.2.0.1","Mask":"/////w=="},"LastSeen":1648196362,"Leader":false,"Location":"","Name":"lc","PersistentKeepalive":0,"Subnet":{"IP":"10.1.3.0","Mask":"////AA=="},"WireGuardIP":null,"DiscoveredEndpoints":null,"AllowedLocationIPs":null,"Granularity":"full"},"ts":"2022-03-25T08:19:22.203607866Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:19:22.20554322Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:19:42.891505001Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:20:12.894562702Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:20:42.897019171Z"}
{"caller":"mesh.go:459","component":"kilo","error":"failed to read the WireGuard dump output: Unable to access interface: Protocol not supported\n","level":"error","ts":"2022-03-25T08:21:12.900503875Z"}
Thanks @hhstu, those logs are a bit more helpful. Unable to access interface: Protocol not supported seems to be a pretty common symptom of WireGuard problems.
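A couple of quick host-side checks for that symptom might be (a sketch):
modprobe wireguard && lsmod | grep wireguard
ls -l /var/run/wireguard/
The first checks whether the kernel module is present and loadable; the second checks whether a userspace implementation like boringtun has created its socket for kilo0.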
Thanks @squat, I will continue to check the problem.
Can I add some use cases such as kubeadm-userspace.yaml and kubeadm-flannel-userspace.yaml with a PR?
Hi @hhstu, yes, I'd be very interested in taking a look at a PR for that 👍. I'm curious how/why it's different. Our E2E tests run on KinD, which uses kubeadm, and we test userspace mode there.
There is nothing different; it was just my mistake: I forgot to set the wireguard volume on the kilo container. I hope to add the kubeadm-userspace and kubeadm-flannel-userspace use cases for the next person.
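For anyone who lands here: with --create-interface=false and a userspace WireGuard sidecar like boringtun, the kilo container needs the wireguard socket directory mounted as well. A hedged sketch of the missing piece, applied to the DaemonSet above (containers/1 is the kilo container, since boringtun is containers/0):
kubectl -n kube-system patch daemonset kilo --type=json -p='[{"op":"add","path":"/spec/template/spec/containers/1/volumeMounts/-","value":{"name":"wireguard","mountPath":"/var/run/wireguard"}}]'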