microk8s compatibility

Open carlosrmendes opened this issue 4 years ago • 18 comments

Is microk8s compatibility on the roadmap? Microk8s uses flannel as its CNI by default. I've tested Kilo on it, but with no success. The Kilo pods start fine and no errors are printed in the logs (w/ log-level=all), but the sudo wg command doesn't show anything, no public key nor endpoint... and on the node the kilo.squat.ai/wireguard-ip annotation shows no IP.

Can you please take a look at microk8s? I think it is interesting now that the microk8s stable version has the clustering option. Thanks in advance.

carlosrmendes avatar Apr 25 '20 19:04 carlosrmendes

I would love for kilo to work on microk8s :) the fact that no error is printed but the node is never fully configured suggests to me that kilo cannot find correct IPs assigned to the node's interfaces so the node is never ready as far as kilo is concerned. Can you post the annotations on the node? I don't have an Ubuntu box to test microk8s on but I'll see if I can spin one up this week and test myself :)

squat avatar Apr 28 '20 11:04 squat

annotations on the node in region "cloud" (first) and on another node (second), in another NAT'ed network: image

wg and ip a output on both nodes (left: cloud; right: NAT'ed): image

kilo logs on both nodes (left: cloud; right: NAT'ed): image

What does the "received incomplete node" message in the logs mean?

carlosrmendes avatar Apr 28 '20 23:04 carlosrmendes

hi @carlosrmendes thanks a lot for that info, it's super helpful :)

The received incomplete node message means that when the kilo agent listed the nodes from the API, a node was missing some data and so it was not considered ready. The completeness check (https://github.com/squat/kilo/blob/master/pkg/mesh/mesh.go#L90-L94) looks for the following data:

  • endpoint
  • internal IP
  • public key
  • recent heartbeat
  • pod subnet

From your screenshots, it seems like the first four are definitely present in the annotations. The pod subnet is taken from the node's spec. Can you verify that all of the nodes have been allocated a pod subnet? Please share the output of:
kubectl get nodes -o=jsonpath="{.items[*]['spec.podCIDR']}"
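
To make the list above concrete, here is a rough Python sketch of what such a completeness check could look like. This is not Kilo's actual code (Kilo is written in Go; see the mesh.go link above): the Node class, its field names, and the five-minute heartbeat threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Hypothetical threshold for a "recent" heartbeat; Kilo's real value may differ.
HEARTBEAT_MAX_AGE = timedelta(minutes=5)


@dataclass
class Node:
    """Illustrative stand-in for the data an agent reads about a node."""
    endpoint: Optional[str] = None
    internal_ip: Optional[str] = None
    public_key: Optional[str] = None
    last_seen: Optional[datetime] = None
    pod_subnet: Optional[str] = None  # would come from .spec.podCIDR


def missing_fields(node: Node, now: datetime) -> List[str]:
    """Return the names of the fields that keep a node from being complete."""
    missing = []
    if not node.endpoint:
        missing.append("endpoint")
    if not node.internal_ip:
        missing.append("internal IP")
    if not node.public_key:
        missing.append("public key")
    if node.last_seen is None or now - node.last_seen > HEARTBEAT_MAX_AGE:
        missing.append("recent heartbeat")
    if not node.pod_subnet:
        missing.append("pod subnet")
    return missing


def is_ready(node: Node, now: datetime) -> bool:
    """A node counts as ready only when nothing is missing."""
    return not missing_fields(node, now)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Everything set except the pod subnet, as in the case discussed here.
    node = Node("203.0.113.10:51820", "10.0.0.5/24", "example-key", now, None)
    print(missing_fields(node, now))
```

A node with everything except the pod subnet would be reported as incomplete without any error, which matches the symptom described in this issue.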

squat avatar Apr 28 '20 23:04 squat

yes, that is the problem... the nodes on microk8s don't have the podCIDR in their spec... :/

carlosrmendes avatar Apr 29 '20 01:04 carlosrmendes

I think the podCIDR of the node is present in the /var/snap/microk8s/common/run/flannel/subnet.env file:

FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.8.1/24
FLANNEL_MTU=8951
FLANNEL_IPMASQ=false
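
For reference, a small Python sketch of how the node's pod subnet could be derived from that file. The helper names are made up for illustration; the only input assumed is the KEY=VALUE format shown above. Note that FLANNEL_SUBNET holds the gateway address with a prefix length (10.1.8.1/24), so the network has to be computed from it.

```python
import ipaddress
from typing import Dict

SAMPLE = """FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.8.1/24
FLANNEL_MTU=8951
FLANNEL_IPMASQ=false
"""


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def pod_cidr_from_flannel(text: str) -> str:
    """Turn FLANNEL_SUBNET (an address with prefix) into its network CIDR."""
    env = parse_env(text)
    return str(ipaddress.ip_interface(env["FLANNEL_SUBNET"]).network)


if __name__ == "__main__":
    print(pod_cidr_from_flannel(SAMPLE))  # prints 10.1.8.0/24
```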

carlosrmendes avatar Apr 29 '20 01:04 carlosrmendes

I manually set the podCIDR in the nodes' spec and kilo started working. :) Is it possible to set the pod subnet as a kg argument or env variable?

But it is not getting the correct persistent-keepalive value... image

carlosrmendes avatar Apr 29 '20 01:04 carlosrmendes

@squat take a look at: https://github.com/kubernetes/kubernetes/issues/57130

carlosrmendes avatar Apr 29 '20 18:04 carlosrmendes

@carlosrmendes thanks for posting about the persistent-keepalive! That was indeed a bug. It's now fixed in master: https://github.com/squat/kilo/commit/e4829832c509f13f45f13f5bb0ef2131394b49bf

squat avatar Apr 30 '20 11:04 squat

perfect! thanks @squat 👌 And what about the pod subnet discovery? Can it only work by reading .spec.podCIDR from the node?

carlosrmendes avatar Apr 30 '20 13:04 carlosrmendes

yes, K8s still supports this today. I'm not sure what microk8s is doing, but pod CIDR allocation is turned on in the controller-manager by default on most kubernetes distributions. Take a look at the controller-manager flags to enable this on microk8s, in particular --allocate-node-cidrs: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/#options

squat avatar Apr 30 '20 13:04 squat

I already tested that flag on the controller-manager and yes, the podCIDR was set, but it is different from the flannel SUBNET assigned to the node 😥
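
A quick way to spot this kind of mismatch, sketched in Python. The function name and the example values are illustrative only; the inputs are assumed to be the node's .spec.podCIDR and the FLANNEL_SUBNET value from subnet.env.

```python
import ipaddress


def cidrs_agree(pod_cidr: str, flannel_subnet: str) -> bool:
    """Compare the node's podCIDR with the network behind FLANNEL_SUBNET."""
    node_net = ipaddress.ip_network(pod_cidr, strict=False)
    flannel_net = ipaddress.ip_interface(flannel_subnet).network
    return node_net == flannel_net


if __name__ == "__main__":
    # Example values only: a podCIDR from a different range than flannel's.
    print(cidrs_agree("10.244.0.0/24", "10.1.8.1/24"))
    print(cidrs_agree("10.1.8.0/24", "10.1.8.1/24"))
```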

carlosrmendes avatar Apr 30 '20 14:04 carlosrmendes

That is quite weird. By default, Flannel actually requires the node.spec.podCIDR field to be set as well https://github.com/coreos/flannel/blob/master/subnet/kube/kube.go#L233-L235

squat avatar Apr 30 '20 15:04 squat

It's possible that microk8s has configured flannel to use etcd as the data store instead of kubernetes, in which case the pod cidr will not be used or taken from the node object but rather saved in etcd. It looks like that configuration info can be found on disk: https://microk8s.io/docs/configuring-services#snapmicrok8sdaemon-flanneld

squat avatar Apr 30 '20 15:04 squat

yes, in microk8s flannel uses etcd. I tried the --kube-subnet-mgr flag, but with that flannel somehow needs authentication to make calls to the API server, because it is not running as a pod (pods can use service accounts)

carlosrmendes avatar Apr 30 '20 15:04 carlosrmendes

Hello @carlosrmendes, I don't know if this can help, but I remember playing with Flannel and kubeadm some time ago.

I needed to pass --pod-network-cidr=192.168.128.0/17 to kubeadm init and use the same value in flannel's net-conf.json for the Network key.

  net-conf.json: |
    {
      "Network": "192.168.128.0/17",
      "Backend": {
        "Type": "host-gw"
      }
    }

Maybe you need a similar config in microk8s so that flannel and the controller-manager use the same CIDR.
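
As a sanity check, something like the following Python sketch could confirm that both sides use the same range. It assumes only the net-conf.json shape shown above; the function name is made up for illustration.

```python
import ipaddress
import json

NET_CONF = """
{
  "Network": "192.168.128.0/17",
  "Backend": {
    "Type": "host-gw"
  }
}
"""


def flannel_matches_cluster_cidr(net_conf_json: str, pod_network_cidr: str) -> bool:
    """True when flannel's Network equals the CIDR given to the cluster."""
    conf = json.loads(net_conf_json)
    flannel_net = ipaddress.ip_network(conf["Network"])
    cluster_net = ipaddress.ip_network(pod_network_cidr)
    return flannel_net == cluster_net


if __name__ == "__main__":
    print(flannel_matches_cluster_cidr(NET_CONF, "192.168.128.0/17"))
```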

JulienVdG avatar Apr 21 '21 12:04 JulienVdG

was this ever resolved ?

kampsv avatar Sep 13 '21 18:09 kampsv

Is this still in the roadmap?

facutk avatar Feb 08 '22 21:02 facutk

Is there a way to adapt the getting started guide offered on the website to microk8s? Does Microk8s still use flannel?

SFxLabs avatar Nov 22 '22 04:11 SFxLabs