calico pods report an error of `no route to host`
Expected Behavior
When the `irqaffinity` kernel parameter is configured in /etc/default/grub, the calico pods should run normally.
Current Behavior
When the `irqaffinity` kernel parameter is configured in /etc/default/grub, the calico pods go into CrashLoopBackOff and report the error `no route to host`.
what changed and calico apiserver logs
calico kube-controller logs
Possible Solution
Removing the `irqaffinity` kernel parameter restores normal calico operation, but we need this parameter for CPU isolation to improve performance.
Steps to Reproduce (for bugs)
- Install kubernetes and calico with kubeadm; the cluster and calico are running.
- Configure the `irqaffinity=0,10` kernel option.
- Reboot the kubernetes node.
- The calico pods go into CrashLoopBackOff and report the error `no route to host`.
root@node31:~# cat /etc/default/grub | grep GRUB_CMDLINE_LINUX
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="irqaffinity=0,6 noirqbalance intel_iommu=on iommu=pt"
root@node31:~#
root@node31:~# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.15.0-102-generic root=/dev/mapper/ubuntu--vg-lv--0 ro irqaffinity=0,10 noirqbalance intel_iommu=on iommu=pt
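The two outputs above show different values (`irqaffinity=0,6` in /etc/default/grub, `irqaffinity=0,10` in /proc/cmdline), so it may be worth confirming which value the node actually booted with. The following is a minimal sketch of such a comparison; `get_irqaffinity` and `compare_irqaffinity` are hypothetical helper names, and the sketch assumes the grub values are double-quoted on a single line, as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of calico or grub.

# Print the irqaffinity= value found in a kernel command-line string
# (the last one if the option is repeated, nothing if it is absent).
get_irqaffinity() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^irqaffinity=//p' | tail -n 1
}

# Compare the value configured in grub with the value the running kernel was booted with.
# $1: grub defaults file, $2: kernel command-line file.
compare_irqaffinity() {
    local grub_file=${1:-/etc/default/grub}
    local cmdline_file=${2:-/proc/cmdline}
    local configured running
    # Collect GRUB_CMDLINE_LINUX and GRUB_CMDLINE_LINUX_DEFAULT (double-quoted values only).
    configured=$(get_irqaffinity "$(sed -n 's/^GRUB_CMDLINE_LINUX[A-Z_]*="\(.*\)"$/\1/p' "$grub_file")")
    running=$(get_irqaffinity "$(cat "$cmdline_file")")
    if [ "$configured" = "$running" ]; then
        echo "match: irqaffinity=$running"
    else
        echo "mismatch: grub='$configured' running='$running'"
    fi
}
```

Running `compare_irqaffinity` with no arguments on the node reads /etc/default/grub and /proc/cmdline. A mismatch only says the file and the booted kernel differ; it does not by itself explain the `no route to host` error.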
I also use reservedSystemCPUs in the kubelet config to reserve CPUs for system processes:
root@node31:~# cat /var/lib/kubelet/config.yaml |grep -i cpu
cpuCFSQuota: true
cpuCFSQuotaPeriod: 100ms
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 10s
cpu: 500m
reservedSystemCPUs: 0,10-19
cpu: 500m
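As a sanity check, the CPUs named in `irqaffinity` can be compared with the kubelet's `reservedSystemCPUs` list, so that interrupts are not steered to CPUs the static CPU manager may hand out exclusively. The sketch below is only an illustration: `expand_cpulist` and `cpus_outside` are hypothetical helper names, and whether this relationship has anything to do with the failure is not established here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing kernel-style CPU lists.

# Expand a CPU list such as "0,10-19" into one CPU id per line.
expand_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        [ -n "$first" ] || continue
        # A single id like "0" has no range end, so it expands to itself.
        seq "$first" "${last:-$first}"
    done
}

# Print every CPU of list $1 that is not contained in list $2.
cpus_outside() {
    expand_cpulist "$1" | while read -r cpu; do
        expand_cpulist "$2" | grep -qx "$cpu" || printf '%s\n' "$cpu"
    done
}
```

With the values from this report, `cpus_outside "0,10" "0,10-19"` prints nothing, meaning both interrupt CPUs are inside the reserved set, while `cpus_outside "0,6" "0,10-19"` prints `6`.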
Context
I need to reserve some CPUs exclusively for the VPP application, so I use irqaffinity to concentrate interrupts on the other CPUs, e.g. 0 and 10.
Your Environment
- Calico version: v3.26.1, installed with helm using the calico operator.
- Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes v1.25.11 with kubeadm, containerd, just one master node.
- Operating System and version: ubuntu 22.04
- Link to your project (optional):
I have the same question.
Warning FailedCreatePodSandBox 114m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3d6733531c3caf74141893557e9d697f5ba909a3a08874087ed6892142153048" network for pod "calico-kube-controllers-7b84757b95-576fg": networkPlugin cni failed to set up pod "calico-kube-controllers-7b84757b95-576fg_kube-system" network: plugin type="calico" failed (add): error creating calico client: stat /etc/cni/net.d/calico-kubeconfig: no such file or directory
Warning Unhealthy 113m (x7 over 114m) kubelet Readiness probe failed: Error initializing datastore: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: connect: no route to host
When I shut down the firewall, the error disappeared, but I need it to work while the firewall is running.
@willzhang do you use calico VPP?
No, just calico IPIP, installed with helm.
@willzhang could you provide any logs from the failing pods? Why they are failing? It does not seem obvious why irq affinity would have such an effect, but perhaps some misconfiguration of network devices? Are you using some overlay? Are queues on the overlay assigned properly? I think vxlan.calico has a single queue only.
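To answer the queue question, the number of rx/tx queues a device exposes can be read from sysfs. This is a rough sketch, assuming the standard `/sys/class/net/<device>/queues` layout; `count_queues` is a hypothetical helper name, and the device names in the usage note (`tunl0` for IPIP, `vxlan.calico` for VXLAN) are assumptions about what exists on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the rx/tx queues of a network device via sysfs.
# $1: device name, $2: sysfs net root (defaults to /sys/class/net).
count_queues() {
    local dev=$1
    local root=${2:-/sys/class/net}
    local rx tx
    # Each queue appears as a directory named rx-N or tx-N.
    rx=$(( $(find "$root/$dev/queues" -maxdepth 1 -name 'rx-*' | wc -l) ))
    tx=$(( $(find "$root/$dev/queues" -maxdepth 1 -name 'tx-*' | wc -l) ))
    echo "$dev rx=$rx tx=$tx"
}
```

For example, `count_queues tunl0` or `count_queues vxlan.calico` on the affected node. The per-interrupt CPU placement can then be read from `/proc/irq/<n>/smp_affinity_list` for the physical NIC's interrupts.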
Solved the problem by reinstalling the OS.