Running kilo on RKE deployed clusters
Hi
First of all, thank you for your awesome work on this project, much appreciated.
We are currently testing Kilo on clusters that we deploy with RKE and later import into Rancher. We use it as the CNI provider in a full-mesh layout.
We used kilo-k3s.yaml as our reference and had to lower the mtu setting in the cni-conf.json ConfigMap to 1300. The rancher-node-agent tries to open a wss:// connection to the Rancher server, which did not succeed with the original 1420 setting. The value 1300 was just our first lucky shot, so it might be worth further testing to push it as high as possible, but we have had no problems with this setting so far. Do you think this is worth documenting in this project? If yes, could you suggest a good place (maybe another file in manifests) so that I can open a PR?
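If it helps with further testing, a rough way to find the highest workable value would be to probe the path to the Rancher server with the don't-fragment bit set and step the payload down until packets get through. This is only a sketch: the hostname is a placeholder, and where it runs from (a Pod on the Kilo network vs. the host) matters, so treat the numbers as a starting point rather than something we measured.

# Probe the path MTU towards the Rancher server (hostname is a placeholder).
# ping -M do sets the don't-fragment bit; -s is the ICMP payload size, so the
# resulting IPv4 packet is payload + 28 bytes (20 IP + 8 ICMP headers).
RANCHER_HOST=rancher.example.com
for payload in 1472 1392 1372 1272; do
    if ping -c 1 -W 2 -M do -s "$payload" "$RANCHER_HOST" > /dev/null 2>&1; then
        echo "OK:   $((payload + 28))-byte packets pass unfragmented"
    else
        echo "FAIL: $((payload + 28))-byte packets are dropped or need fragmentation"
    fi
done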
Hi @jbrinksmeier thanks for raising this issue. This certainly seems like something worth documenting.
Do you have any idea why you needed to lower the MTU? Also, if you lowered the MTU in the CNI configuration, the change won't affect Pods running in the host networking namespace, so I wonder if these Pods will still have problems with large IP packets. Can you share the output of ip l
on one of your hosts?
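For comparison, something along these lines would show the difference; the Pod names below are just placeholders for whatever is running in your cluster:

# MTU as seen from inside a regular Pod (veth attached to kube-bridge)
kubectl exec some-regular-pod -- cat /sys/class/net/eth0/mtu
# MTU as seen from a Pod with hostNetwork: true (the node's real interface)
kubectl exec some-host-network-pod -- cat /sys/class/net/eth0/mtu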
Frankly, I have no idea why this websocket connection needed such a low(er) MTU. Decreasing it was in fact a lucky shot; I have had issues like this with all kinds of VPN software before, which is why I gave it a try when this particular problem came up. We are not done testing yet, as I am on vacation right now, but so far we have had no issues with the running cluster, which operates fairly simple services such as various databases and some PHP containers. Anyway, here is the output of ip l from one host, as requested:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 12:42:63:7c:30:c1 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether ae:ce:41:23:45:f0 brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:76:45:79:d0 brd ff:ff:ff:ff:ff:ff
83: kube-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 5e:64:6c:df:8a:86 brd ff:ff:ff:ff:ff:ff
85: kilo0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/none
86: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
88: vethdcaf3da5@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc noqueue master kube-bridge state UP mode DEFAULT group default
    link/ether 62:fa:29:a7:9e:ef brd ff:ff:ff:ff:ff:ff link-netnsid 0
148: veth5222bf5b@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc noqueue master kube-bridge state UP mode DEFAULT group default
    link/ether a2:06:2e:fe:a3:85 brd ff:ff:ff:ff:ff:ff link-netnsid 1
161: veth78b9147f@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc noqueue master kube-bridge state UP mode DEFAULT group default
    link/ether 5e:64:6c:df:8a:86 brd ff:ff:ff:ff:ff:ff link-netnsid 2
162: veth6c407d3d@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc noqueue master kube-bridge state UP mode DEFAULT group default
    link/ether d2:08:43:6b:3c:6c brd ff:ff:ff:ff:ff:ff link-netnsid 3
164: veth6184d2bd@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc noqueue master kube-bridge state UP mode DEFAULT group default
    link/ether b6:18:fd:86:41:77 brd ff:ff:ff:ff:ff:ff link-netnsid 4
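For what it's worth, the layering above is what I would expect: eth0 at 1500, kilo0 at 1420 (WireGuard's usual 80 bytes of headroom for its encapsulation), and kube-bridge plus the veths at the 1300 we configured. Whether the extra headroom below 1420 is really needed could be checked per path; the address below is only a placeholder for a Pod or node IP on another mesh member.

# tracepath discovers the path MTU hop by hop without needing root; replace
# the placeholder address with a Pod IP on a different node. The reported
# "pmtu" should drop to 1420 where traffic enters kilo0.
tracepath -n 10.42.1.23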
I just noticed the part about Pods in the host network; I had missed the importance of host networking in that regard. I will test over the next few days whether we can operate Pods in the host network properly.
Could you please share the config for the RKE setup? Thanks!
@squat So far we have had no issues running Pods in the host network. It really seems to have been an issue only for the one-time websocket request that registers the cluster with Rancher.
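If we want to narrow that down further, my guess is an MTU black hole on the path to the Rancher server: the TLS handshake for the wss:// connection ships certificate records close to a full-sized packet, so a plain HTTPS request from inside a Pod should stall in the same way. Sketch only; the Pod name and hostname are placeholders, and the Pod image needs curl.

# Exercise the same path as the registration request from inside a regular Pod.
# With the MTU set too high this tends to hang until the 10-second timeout
# instead of printing an HTTP status code.
kubectl exec some-regular-pod -- curl -sk -m 10 -o /dev/null -w '%{http_code}\n' https://rancher.example.com/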
@laci84 Sure can. As mentioned, this is simply the manifest from https://github.com/squat/kilo/blob/master/manifests/kilo-k3s.yaml with a changed mtu setting. Here you go:
apiVersion: v1
kind: ConfigMap
metadata:
  name: kilo
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kilo
data:
  cni-conf.json: |
    {
       "cniVersion":"0.3.1",
       "name":"kilo",
       "plugins":[
          {
             "name":"kubernetes",
             "type":"bridge",
             "bridge":"kube-bridge",
             "isDefaultGateway":true,
             "forceAddress":true,
             "mtu": 1300,
             "ipam":{
                "type":"host-local"
             }
          },
          {
             "type":"portmap",
             "snat":true,
             "capabilities":{
                "portMappings":true
             }
          }
       ]
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kilo
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kilo
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - patch
  - watch
- apiGroups:
  - kilo.squat.ai
  resources:
  - peers
  verbs:
  - list
  - update
  - watch
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kilo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kilo
subjects:
- kind: ServiceAccount
  name: kilo
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kilo
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kilo
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kilo
  template:
    metadata:
      labels:
        app.kubernetes.io/name: kilo
    spec:
      serviceAccountName: kilo
      hostNetwork: true
      containers:
      - name: kilo
        image: squat/kilo
        args:
        - --kubeconfig=/etc/kubernetes/kubeconfig
        - --hostname=$(NODE_NAME)
        - --mesh-granularity=full
        - --subnet=10.5.0.0/24
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
        volumeMounts:
        - name: cni-conf-dir
          mountPath: /etc/cni/net.d
        - name: kilo-dir
          mountPath: /var/lib/kilo
        - name: kubeconfig
          mountPath: /etc/kubernetes/kubeconfig
          readOnly: true
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
        - name: xtables-lock
          mountPath: /run/xtables.lock
          readOnly: false
      initContainers:
      - name: install-cni
        image: squat/kilo
        command:
        - /bin/sh
        - -c
        - set -e -x;
          cp /opt/cni/bin/* /host/opt/cni/bin/;
          TMP_CONF="$CNI_CONF_NAME".tmp;
          echo "$CNI_NETWORK_CONFIG" > $TMP_CONF;
          rm -f /host/etc/cni/net.d/*;
          mv $TMP_CONF /host/etc/cni/net.d/$CNI_CONF_NAME
        env:
        - name: CNI_CONF_NAME
          value: 10-kilo.conflist
        - name: CNI_NETWORK_CONFIG
          valueFrom:
            configMapKeyRef:
              name: kilo
              key: cni-conf.json
        volumeMounts:
        - name: cni-bin-dir
          mountPath: /host/opt/cni/bin
        - name: cni-conf-dir
          mountPath: /host/etc/cni/net.d
      tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
      volumes:
      - name: cni-bin-dir
        hostPath:
          path: /opt/cni/bin
      - name: cni-conf-dir
        hostPath:
          path: /etc/cni/net.d
      - name: kilo-dir
        hostPath:
          path: /var/lib/kilo
      - name: kubeconfig
        hostPath:
          path: /etc/kubernetes/kilo_kube_conf.yaml
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
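In case it helps anyone else trying this on RKE: applying and sanity-checking it is the usual routine. The file name below is just whatever you saved the manifest as, and the kubeconfig hostPath has to exist on every node (ours is the /etc/kubernetes/kilo_kube_conf.yaml referenced above).

# Apply the adjusted manifest and check that the DaemonSet comes up on all nodes.
kubectl apply -f kilo-rke.yaml
kubectl -n kube-system get pods -l app.kubernetes.io/name=kilo -o wide
# On any node, the WireGuard interface created by Kilo should list one peer
# per other node when running with --mesh-granularity=full.
sudo wg show kilo0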