Unexpected lxd-provisioning behaviour
Describe the bug
The README procedure works as expected until we run the bootstrap script on the nodes; on both workers it fails at the final task:
$ cat bootstrap-kube.sh | lxc exec kworker1 bash
[TASK 1] Install essential packages
[TASK 2] Install containerd runtime
[TASK 3] Set up kubernetes repo
[TASK 4] Install Kubernetes components (kubeadm, kubelet and kubectl)
[TASK 5] Enable ssh password authentication
[TASK 6] Set root password
[TASK 7] Join node to Kubernetes Cluster
Error: Command not found

$ cat bootstrap-kube.sh | lxc exec kworker2 bash
[TASK 1] Install essential packages
[TASK 2] Install containerd runtime
[TASK 3] Set up kubernetes repo
[TASK 4] Install Kubernetes components (kubeadm, kubelet and kubectl)
[TASK 5] Enable ssh password authentication
[TASK 6] Set root password
[TASK 7] Join node to Kubernetes Cluster
Error: Command not found
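(A minimal way to inspect the failing task by hand, assuming the same container names as above, would be:
$ lxc exec kworker1 -- ls -l /joincluster.sh
$ lxc exec kworker1 -- bash /joincluster.sh
The first command confirms whether the join script was created on the worker at all; the second re-runs the join step interactively so the full error is visible. These are just suggested checks, not part of the README.)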
How To Reproduce
- git clone https://github.com/justmeandopensource/kubernetes.git
- cd kubernetes/lxd-provisioning/
- sudo apt-get update && sudo apt-get install lxc -y
- sudo systemctl status lxc
- lxd init (accepted defaults on all questions)
- lxc profile create k8s
- cat k8s-profile-config | lxc profile edit k8s
- lxc profile list
- lxc launch ubuntu:22.04 kmaster --profile k8s
- lxc launch ubuntu:22.04 kworker1 --profile k8s
- lxc launch ubuntu:22.04 kworker2 --profile k8s
- sudo sysctl -w net.netfilter.nf_conntrack_max=524288
- cat bootstrap-kube.sh | lxc exec kmaster bash
- cat bootstrap-kube.sh | lxc exec kworker1 bash
- cat bootstrap-kube.sh | lxc exec kworker2 bash
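(For reference, a quick sanity check of the container state at any point in the steps above is:
$ lxc list
kmaster, kworker1 and kworker2 should all show as RUNNING with an IPv4 address.)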
Expected behavior
root@kmaster:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster Ready master 8m53s v1.19.2
kworker1 Ready
Screenshots (if any)
root@kmaster:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster NotReady control-plane 134m v1.29.3
Environment (please complete the following information):
Running on a t2.medium instance on AWS.
$ lsb_release -a && uname -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
Linux kmaster 6.5.0-1014-aws #14~22.04.1-Ubuntu SMP Thu Feb 15 15:27:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.3", GitCommit:"6813625b7cd706db5bc7388921be03071e1a492d", GitTreeState:"clean", BuildDate:"2024-03-15T00:06:16Z", GoVersion:"go1.21.8", Compiler:"gc", Platform:"linux/amd64"}
$ nproc
2
$ free -h
total used free shared buff/cache available
Mem: 3.8Gi 1.6Gi 332Mi 4.0Mi 1.9Gi 1.9Gi
Swap: 0B 0B 0B
If one invokes lxc exec kmaster bash, /joincluster.sh can be found but is not executable. After chmod 755 /joincluster.sh, one observes the following:
$ /joincluster.sh
[preflight] Running pre-flight checks
[WARNING FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 6.5.0-1014-aws
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
CGROUPS_PIDS: enabled
CGROUPS_HUGETLB: enabled
CGROUPS_IO: enabled
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: FATAL: Module configs not found in directory /lib/modules/6.5.0-1014-aws\n", err: exit status 1
[WARNING Port-10250]: Port 10250 is in use
[WARNING FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Similar behaviour is observed if one runs the contents of /joincluster.sh on the worker nodes:
kubeadm join 10.181.232.175:6443 --token xlv0o6.e85vu38ygxuwts2y --discovery-token-ca-cert-hash sha256:9663e71b401bbd437ae3ec43fd06c29e1a2f6d6cd9de617ce192d74bff8db81a --ignore-preflight-errors=all
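(If it helps, the kubelet on a worker can also be inspected after the join with something like:
$ lxc exec kworker1 -- systemctl status kubelet --no-pager
$ lxc exec kworker1 -- journalctl -u kubelet --no-pager -n 50
Again, these are suggested diagnostics rather than part of the README procedure.)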
kubectl get nodes does not behave as expected:
root@kmaster:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster NotReady control-plane 3h1m v1.29.3
kworker1 NotReady <none> 6m4s v1.29.3
kworker2 NotReady <none> 5m45s v1.29.3
root@kmaster:~# kubectl -n kube-system get all
NAME READY STATUS RESTARTS AGE
pod/coredns-76f75df574-mqb8w 0/1 Pending 0 19h
pod/coredns-76f75df574-x5jsv 0/1 Pending 0 19h
pod/etcd-kmaster 1/1 Running 0 19h
pod/kube-apiserver-kmaster 1/1 Running 0 19h
pod/kube-controller-manager-kmaster 1/1 Running 0 19h
pod/kube-proxy-89f2w 0/1 CrashLoopBackOff 198 (4m44s ago) 16h
pod/kube-proxy-wwwpj 0/1 CrashLoopBackOff 236 (118s ago) 19h
pod/kube-proxy-z4rh4 0/1 CrashLoopBackOff 198 (2m32s ago) 16h
pod/kube-scheduler-kmaster 1/1 Running 0 19h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 19h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-proxy 3 3 0 3 0 kubernetes.io/os=linux 19h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 0/2 2 0 19h
NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-76f75df574 2 2 0 19h
Can you post the logs for the kube-proxy pods?
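(For reference, using the pod names from the kubectl -n kube-system get all output above, they can be collected with:
$ kubectl -n kube-system logs pod/kube-proxy-89f2w
$ kubectl -n kube-system logs pod/kube-proxy-89f2w --previous
$ kubectl -n kube-system describe pod kube-proxy-89f2w
The --previous flag shows the log of the last crashed container, which is usually the interesting one for a CrashLoopBackOff.)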