
unexpected lxd-provisioning behaviour

Open parallastra opened this issue 10 months ago • 1 comment

Describe the bug

The README procedure works as expected until we run the bootstrap script on all nodes:

$ cat bootstrap-kube.sh | lxc exec kworker1 bash
[TASK 1] Install essential packages
[TASK 2] Install containerd runtime
[TASK 3] Set up kubernetes repo
[TASK 4] Install Kubernetes components (kubeadm, kubelet and kubectl)
[TASK 5] Enable ssh password authentication
[TASK 6] Set root password
[TASK 7] Join node to Kubernetes Cluster
Error: Command not found

$ cat bootstrap-kube.sh | lxc exec kworker2 bash
[TASK 1] Install essential packages
[TASK 2] Install containerd runtime
[TASK 3] Set up kubernetes repo
[TASK 4] Install Kubernetes components (kubeadm, kubelet and kubectl)
[TASK 5] Enable ssh password authentication
[TASK 6] Set root password
[TASK 7] Join node to Kubernetes Cluster
Error: Command not found

How To Reproduce

  • git clone https://github.com/justmeandopensource/kubernetes.git
  • cd kubernetes/lxd-provisioning/
  • sudo apt-get update && sudo apt-get install lxc -y
  • sudo systemctl status lxc
  • lxd init (accepted defaults on all questions)
  • lxc profile create k8s
  • cat k8s-profile-config | lxc profile edit k8s
  • lxc profile list
  • lxc launch ubuntu:22.04 kmaster --profile k8s
  • lxc launch ubuntu:22.04 kworker1 --profile k8s
  • lxc launch ubuntu:22.04 kworker2 --profile k8s
  • sudo sysctl -w net.netfilter.nf_conntrack_max=524288
  • cat bootstrap-kube.sh | lxc exec kmaster bash
  • cat bootstrap-kube.sh | lxc exec kworker1 bash
  • cat bootstrap-kube.sh | lxc exec kworker2 bash
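An aside on the sysctl step in the list above: kube-proxy tries to raise net.netfilter.nf_conntrack_max at startup and commonly crash-loops inside an unprivileged container where that sysctl is read-only, which is presumably why the README sets it on the host first. A hedged host-side sanity check (the 524288 value is taken from the step above) might look like:

```shell
# Sketch: verify the host conntrack table size is already at least what
# kube-proxy will ask for, so it does not try (and fail) to raise it
# from inside the container.
required=524288
current=$(sysctl -n net.netfilter.nf_conntrack_max 2>/dev/null || echo 0)
if [ "${current:-0}" -ge "$required" ]; then
  status="conntrack ok ($current)"
else
  status="conntrack too low ($current < $required)"
fi
echo "$status"
```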

Expected behavior

root@kmaster:~# kubectl get nodes
NAME       STATUS   ROLES    AGE     VERSION
kmaster    Ready    master   8m53s   v1.19.2
kworker1   Ready    <none>   5m35s   v1.19.2
kworker2   Ready    <none>   3m39s   v1.19.2

Screenshots (if any)

root@kmaster:~# kubectl get nodes
NAME      STATUS     ROLES           AGE    VERSION
kmaster   NotReady   control-plane   134m   v1.29.3

Environment (please complete the following information)

Running a t2.medium instance on AWS:

$ lsb_release -a && uname -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy
Linux kmaster 6.5.0-1014-aws #14~22.04.1-Ubuntu SMP Thu Feb 15 15:27:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.3", GitCommit:"6813625b7cd706db5bc7388921be03071e1a492d", GitTreeState:"clean", BuildDate:"2024-03-15T00:06:16Z", GoVersion:"go1.21.8", Compiler:"gc", Platform:"linux/amd64"}

$ nproc
2
$ free -h
               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       1.6Gi       332Mi       4.0Mi       1.9Gi       1.9Gi
Swap:             0B          0B          0B

If one invokes lxc exec kmaster bash, /joincluster.sh can be found but is not executable. After chmod 755 /joincluster.sh, one observes the following:

$ /joincluster.sh 
[preflight] Running pre-flight checks
	[WARNING FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 6.5.0-1014-aws
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
CGROUPS_PIDS: enabled
CGROUPS_HUGETLB: enabled
CGROUPS_IO: enabled
	[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: FATAL: Module configs not found in directory /lib/modules/6.5.0-1014-aws\n", err: exit status 1
	[WARNING Port-10250]: Port 10250 is in use
	[WARNING FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Similar behaviour is observed if one runs the contents of /joincluster.sh on the worker nodes:

kubeadm join 10.181.232.175:6443 --token xlv0o6.e85vu38ygxuwts2y --discovery-token-ca-cert-hash sha256:9663e71b401bbd437ae3ec43fd06c29e1a2f6d6cd9de617ce192d74bff8db81a  --ignore-preflight-errors=all
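Since the join above was forced with --ignore-preflight-errors=all, one thing worth ruling out is a stale token: the token baked into /joincluster.sh has a default TTL of 24 hours. A sketch, assuming it runs on the control-plane node:

```shell
# Sketch: mint a fresh join command instead of reusing the token stored
# in /joincluster.sh, in case the stored one has expired.
if command -v kubeadm >/dev/null 2>&1; then
  join_cmd=$(kubeadm token create --print-join-command)
else
  # Fallback so the sketch is safe to run anywhere.
  join_cmd="kubeadm unavailable on this host"
fi
echo "$join_cmd"
```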

kubectl get nodes does not behave as expected:

root@kmaster:~# kubectl get nodes
NAME       STATUS     ROLES           AGE     VERSION
kmaster    NotReady   control-plane   3h1m    v1.29.3
kworker1   NotReady   <none>          6m4s    v1.29.3
kworker2   NotReady   <none>          5m45s   v1.29.3
root@kmaster:~# kubectl -n kube-system get all
NAME                                  READY   STATUS             RESTARTS          AGE
pod/coredns-76f75df574-mqb8w          0/1     Pending            0                 19h
pod/coredns-76f75df574-x5jsv          0/1     Pending            0                 19h
pod/etcd-kmaster                      1/1     Running            0                 19h
pod/kube-apiserver-kmaster            1/1     Running            0                 19h
pod/kube-controller-manager-kmaster   1/1     Running            0                 19h
pod/kube-proxy-89f2w                  0/1     CrashLoopBackOff   198 (4m44s ago)   16h
pod/kube-proxy-wwwpj                  0/1     CrashLoopBackOff   236 (118s ago)    19h
pod/kube-proxy-z4rh4                  0/1     CrashLoopBackOff   198 (2m32s ago)   16h
pod/kube-scheduler-kmaster            1/1     Running            0                 19h

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   19h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/kube-proxy   3         3         0       3            0           kubernetes.io/os=linux   19h

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   0/2     2            0           19h

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-76f75df574   2         2         0       19h
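To diagnose the CrashLoopBackOff shown above, the kube-proxy logs are the obvious next step. A hedged sketch (the k8s-app=kube-proxy label selector and the --previous flag are standard kubectl usage, not taken from this report):

```shell
# Sketch: collect recent logs from each kube-proxy pod, including the
# previously crashed container, to see why it keeps restarting.
if command -v kubectl >/dev/null 2>&1; then
  pods=$(kubectl -n kube-system get pods -l k8s-app=kube-proxy -o name)
  for pod in $pods; do
    echo "== $pod =="
    kubectl -n kube-system logs "$pod" --previous 2>&1 | tail -n 20
  done
  result="collected"
else
  # Fallback so the sketch is safe to run anywhere.
  result="kubectl unavailable on this host"
fi
echo "$result"
```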

parallastra avatar Apr 10 '24 20:04 parallastra

Can you post logs for kube-proxy pods?

Cloud-Mak avatar Aug 23 '24 18:08 Cloud-Mak