Calico reports errors when running on a 4.x or 5.x kernel
k8s node, k8s version v1.20.6, networking installed via calico.yaml, nodes running CentOS 7. After upgrading the kernel to 4.x or 5.x, CPU usage rises by about 10% and errors appear in the logs; without the kernel upgrade neither happens.
Node where the pod is running:
[root@devops ~]# kubectl get pods calico-node-cvmhh -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-node-cvmhh 1/1 Running 0 29h 192.1.0.39 k8s-v1206-node-03
Kernel version on the pod's node:
[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 4.19.113-300.el7.x86_64 #1 SMP Mon Mar 30 21:50:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Pod logs:
[root@devops ~]# kubectl logs calico-node-cvmhh -n kube-system --tail=100
2021-10-06 17:27:54.774 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=
Monitoring
atop shows NetworkManager consuming around 7% CPU, which is abnormal:
PID CID SYSCPU USRCPU RDELAY VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/5
767 host-------- 2.92s 3.81s 7.00s 537.6M 13908K 9972K 12K root root N- - 3 S 0 7% NetworkManager
1 host-------- 2.41s 0.45s 2.37s 125.3M 8408K 271.4M 288.8M root root N- - 1 S 1 3% systemd
1029 host-------- 1.63s 1.10s 5.75s 1.5G 94276K 128.0M 124K root root N- - 15 S 0 3% kubelet
1110 host-------- 1.42s 0.98s 5.22s 1.2G 89056K 91520K 6316K root root N- - 18 S 0 2% dockerd
760 host-------- 0.54s 1.19s 2.39s 66636K 4588K 924K 0K dbus dbus N- - 2 S 0 2% dbus-daemon
628 host-------- 1.07s 0.21s 3.25s 49220K 7088K 15264K 116K root root N- - 1 S 1 1% systemd-udevd
601 host-------- 0.55s 0.19s 1.51s 39452K 7200K 1036K 0K root root N- - 1 S 0 1% systemd-journa
18128 2d43e3a75772 0.32s 0.39s 2.22s 1.4G 59468K 0K 28K root root N- - 11 S 0 1% calico-node
1038 host-------- 0.43s 0.23s 1.98s 1.0G 44608K 47252K 1964K root root N- - 9 S 0 1% containerd
After switching the kernel back to 3.x:
[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Delete the old pod:
[root@devops ~]# kubectl delete pods calico-node-cvmhh -n kube-system
pod "calico-node-cvmhh" deleted
The new pod comes up and the failures are gone:
[root@devops ~]# kubectl get pods calico-node-qdq7c -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-node-qdq7c 1/1 Running 1 4m7s 192.1.0.39 k8s-v1206-node-03
[root@devops ~]# kubectl logs calico-node-qdq7c -n kube-system --tail=100
bird: Next hop address 20.0.0.0 resolvable through recursive route for 20.0.0.0/24
bird: Next hop address 20.0.1.0 resolvable through recursive route for 20.0.1.0/24
bird: Next hop address 20.0.3.0 resolvable through recursive route for 20.0.3.0/24
bird: Next hop address 20.0.2.0 resolvable through recursive route for 20.0.2.0/32
bird: Next hop address 20.0.0.0 resolvable through recursive route for 20.0.0.0/24
bird: Next hop address 20.0.1.0 resolvable through recursive route for 20.0.1.0/32
bird: Next hop address 20.0.3.0 resolvable through recursive route for 20.0.3.0/24
bird: Next hop address 20.0.2.0 resolvable through recursive route for 20.0.2.0/32
bird: Next hop address 20.0.0.0 resolvable through recursive route for 20.0.0.0/32
bird: Next hop address 20.0.1.0 resolvable through recursive route for 20.0.1.0/32
bird: Next hop address 20.0.3.0 resolvable through recursive route for 20.0.3.0/24
bird: Next hop address 20.0.2.0 resolvable through recursive route for 20.0.2.0/32
bird: Graceful restart done
bird: Mesh_192_1_0_34: State changed to feed
bird: Mesh_192_1_0_37: State changed to feed
bird: Mesh_192_1_0_38: State changed to feed
bird: Mesh_192_1_0_34: State changed to up
bird: Mesh_192_1_0_37: State changed to up
bird: Mesh_192_1_0_38: State changed to up
2021-10-06 17:37:26.234 [INFO][62] felix/health.go 196: Overall health status changed newStatus=&health.HealthReport{Live:true, Ready:true}
2021-10-06 17:37:29.973 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=kube-system/calico-kube-controllers-855445d444-bl2bn, name=eth0)
2021-10-06 17:37:29.973 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"kube-system/calico-kube-controllers-855445d444-bl2bn" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali092a3b8dbd6" profile_ids:"kns.kube-system" profile_ids:"ksa.kube-system.calico-kube-controllers" ipv4_nets:"20.159.149.77/32" >
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.974 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:29.974 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali092a3b8dbd6"
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.974 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.985 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:37:29.985 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} status="up"
(the same "Local endpoint updated" / chain update / status "up" sequence repeats at 17:37:39, 17:37:51, 17:37:52, 17:37:58, 17:38:04 and 17:38:05 for the calico-kube-controllers and nfs-client-provisioner endpoints)
2021-10-06 17:38:21.647 [INFO][50] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface ens33: 192.1.0.39/16
2021-10-06 17:38:24.756 [INFO][62] felix/summary.go 100: Summarising 41 dataplane reconciliation loops over 1m3.1s: avg=11ms longest=240ms (resync-filter-v4,resync-ipsets-v4,resync-mangle-v4,resync-nat-v4,resync-raw-v4,resync-routes-v4,resync-routes-v4,resync-rules-v4,update-filter-v4,update-ipsets-4,update-mangle-v4,update-nat-v4,update-raw-v4)
2021-10-06 17:39:21.649 [INFO][50] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface ens33: 192.1.0.39/16
Monitoring
9868 dfc6a1a4eb0b 7.14s 2.12s 3.96s 1.6G 46596K 10620K 4K root root N- - 13 S 1 2% calico-node
960 host-------- 5.82s 2.93s 9.78s 1.3G 79780K 167.6M 128K root root N- - 16 S 0 2% kubelet
1091 host-------- 2.48s 1.96s 8.79s 1.1G 78124K 88728K 2660K root root N- - 11 S 0 1% dockerd
965 host-------- 1.80s 0.83s 5.03s 1.0G 42212K 47420K 1684K root root N- - 9 S 0 1% containerd
1 host-------- 1.63s 0.33s 0.36s 125.4M 7064K 326.9M 162.6M root root N- - 1 S 1 0% systemd
536 host-------- 0.22s 0.79s 0.49s 50112K 6632K 15044K 0K root root N- - 1 S 0 0% systemd-udevd
10652 54cf7a7b7c43 0.91s 0.07s 1.10s 30900K 13144K 0K 0K root root N- - 7 S 1 0% nfs-client-pro
9 host-------- 0.66s 0.00s 5.02s 0K 0K 0K 0K root root N- - 1 S 1 0% rcu_sched
9712 host-------- 0.33s 0.11s 1.35s 696.4M 13720K 4K 0K root root N- - 11 S 0 0% containerd-shi
1006 host-------- 0.43s 0.00s 0.53s 123.4M 1692K 172K 36K root root N- - 1 S 0 0% crond
10395 host-------- 0.32s 0.09s 0.80s 696.4M 11992K 0K 0K root root N- - 11 S 0 0% containerd-shi
10414 30ad04d4a575 0.21s 0.11s 0.18s 729.8M 32068K 0K 16K polkitd polkitd N- - 5 S 1 0% kube-controlle
499 host-------- 0.27s 0.05s 0.36s 39448K 3720K 1036K 0K root root N- - 1 S 1 0% systemd-journa
@sridhartigera Could you help take a look? Felix hits a "failed to wipe the XDP state" error after the kernel upgrade.
@mazdakn @neiljerram Looks like some xdp failures. Can you PTAL?
This is a CentOS 7 system. The errors appeared after upgrading the kernel to 4.x or 5.x, and CPU usage went up by about 10%.
Kernel version before the upgrade:
[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Kernel version after the upgrade:
[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 4.19.113-300.el7.x86_64 #1 SMP Mon Mar 30 21:50:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
@sridhartigera Please note:
2021-10-06 17:27:55.561 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
It looks like the reporter has an interface named calico_tmp_A. Didn't you recently see a problem where it was necessary to remove certain special characters when converting from the interface name to the corresponding file name in the bpffs?
@PayneRose You may like to try again with interface names without special characters. For example "calicotmpA" and "calicotmpB" instead of "calico_tmp_A" and "calico_tmp_B".
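If a sanitization like that is the cause, the effect is easy to picture. A shell sketch, purely illustrative; the actual transformation Felix applies (if any) is not shown in this thread, so the `tr` step below is an assumption:

```shell
# Illustrative only: Felix pins its XDP prefilter under /sys/fs/bpf/calico/xdp/,
# deriving the pin name from the interface name. If special characters such as
# '_' were stripped when building the pin name, a later stat() of the
# unsanitized name would fail exactly as in the error above.
iface="calico_tmp_A"
pin_dir="/sys/fs/bpf/calico/xdp"

echo "${pin_dir}/prefilter_v1_${iface}"          # the name stat'ed in the error

# One hypothetical sanitization: drop everything that is not alphanumeric.
sanitized=$(printf '%s' "$iface" | tr -cd '[:alnum:]')
echo "${pin_dir}/prefilter_v1_${sanitized}"      # → .../prefilter_v1_calicotmpA
```

That mismatch between the name being looked up and the name actually pinned would match the "no such file or directory" in the log, which is why renaming the interfaces is worth a try.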
@PayneRose ping - any update on this?
Has anyone found the problem? It seems related to SELinux, from what I see in audit.log.
@quick691fr Please do feel free to explain more!
@neiljerram
I have Calico installed in a k8s cluster on CentOS nodes deployed from Rancher. Here is part of the continuous log output (a new entry every 130 ms) from the k8s_calico-node_canal-42f4n_kube-system_fe2012fc-ab67-4bc0-a083-3ff6cb3df3e9_0 container:
[root@wrkr-1 centos]# docker logs `docker ps | grep calico | cut -d ' ' -f 1`
2022-03-21 14:43:53.321 [WARNING][40] felix/int_dataplane.go 1394: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file
try=3
2022-03-21 14:43:53.454 [WARNING][40] felix/int_dataplane.go 1394: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file
try=4
Since it is a permission-denied error, I checked the SELinux-related audit.log; here are the continuous entries (also every 130 ms):
[root@wrkr-1 centos]# tail -f /var/log/audit/audit.log
type=AVC msg=audit(1647949699.710:2927154): avc: denied { map_create } for pid=29339 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1647949699.730:2927155): avc: denied { prog_load } for pid=29350 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1647949699.730:2927156): avc: denied { map_create } for pid=29350 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1647949699.730:2927157): avc: denied { map_create } for pid=29350 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
Seeing these denials for bpftool in audit.log, I tried to change the SELinux context of the /sys/fs/bpf/calico/xdp/ path, but it seems to be a virtual filesystem and I'm not able to change the context.
The problem is that the calico container keeps filling its log file and then the filesystem; after a few days the node hits a "no space left on device" error and stops working.
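For what it's worth, the AVC records above can be decoded mechanically to confirm what is being denied. A minimal sketch, operating on one of the pasted audit.log lines:

```shell
# Parse one AVC record: pull out the denied permission and the target class
# to confirm the denials are against the bpf class (map_create / prog_load).
line='type=AVC msg=audit(1647949699.710:2927154): avc: denied { map_create } for pid=29339 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0'

perm=$(printf '%s\n' "$line" | sed -n 's/.*denied *{ *\([^}]*\)}.*/\1/p' | tr -d ' ')
tclass=$(printf '%s\n' "$line" | sed -n 's/.*tclass=\([^ ]*\).*/\1/p')
echo "denied ${perm} on tclass ${tclass}"   # → denied map_create on tclass bpf
```

On a live node, `ausearch -m avc -ts recent` and `audit2allow` (from the policycoreutils tooling) are the usual next steps for turning such denials into a local policy module; whether allowing bpf `prog_load`/`map_create` for `spc_t` is actually appropriate is a question for the SELinux policy maintainers.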
@neiljerram ping
@sridhartigera ping ?
@quick691fr I'm afraid I don't know. Please can you ask for help in the SELinux community?
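Regarding the disk filling up: independent of the SELinux root cause, capping container log size limits the damage. A sketch assuming Docker's default json-file logging driver on these nodes (the size values are examples, not recommendations):

```shell
# Stopgap, not a fix: rotate and cap per-container logs so the repeating
# WARNING lines cannot fill the filesystem. Assumes the json-file log driver;
# merge with any existing /etc/docker/daemon.json rather than overwriting it.
cat > /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "3"
  }
}
EOF
systemctl restart docker   # restarts containers unless live-restore is enabled
```

The settings only apply to containers created after the restart, so the calico-node pod would need to be recreated as well.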
Same error here:
2024-05-13T10:27:51.097729825+08:00 stdout F 2024-05-13 02:27:51.097 [INFO][69] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=<nil> ifaceName="calico_tmp_A"
2024-05-13T10:27:51.097732276+08:00 stdout F 2024-05-13 02:27:51.097 [INFO][69] felix/int_dataplane.go 1387: Linux interface state changed. ifIndex=4052912 ifaceName="calico_tmp_B" state=""
2024-05-13T10:27:51.097734349+08:00 stdout F 2024-05-13 02:27:51.097 [INFO][69] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=<nil> ifaceName="calico_tmp_B"
2024-05-13T10:27:51.104990159+08:00 stdout F 2024-05-13 02:27:51.103 [INFO][69] felix/int_dataplane.go 2154: Applying XDP actions did not succeed, disabling XDP error=failed to resync: failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
2024-05-13T10:27:51.105000003+08:00 stdout F libbpf: Error loading BTF: Invalid argument(22)
2024-05-13T10:27:51.105002529+08:00 stdout F libbpf: magic: 0xeb9f
2024-05-13T10:27:51.105004766+08:00 stdout F version: 1
2024-05-13T10:27:51.10500674+08:00 stdout F flags: 0x0
2024-05-13T10:27:51.105008777+08:00 stdout F hdr_len: 24
2024-05-13T10:27:51.105010677+08:00 stdout F type_off: 0
2024-05-13T10:27:51.105012721+08:00 stdout F type_len: 936
2024-05-13T10:27:51.105014707+08:00 stdout F str_off: 936
2024-05-13T10:27:51.105016551+08:00 stdout F str_len: 1142
2024-05-13T10:27:51.105018656+08:00 stdout F btf_total_size: 2102
2024-05-13T10:27:51.105020527+08:00 stdout F [1] PTR (anon) type_id=3
2024-05-13T10:27:51.105022656+08:00 stdout F [2] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
2024-05-13T10:27:51.105024618+08:00 stdout F [3] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=1
2024-05-13T10:27:51.105026643+08:00 stdout F [4] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
2024-05-13T10:27:51.105028545+08:00 stdout F [5] PTR (anon) type_id=6
2024-05-13T10:27:51.105030443+08:00 stdout F [6] STRUCT protoport size=4 vlen=2
2024-05-13T10:27:51.105032886+08:00 stdout F proto type_id=7 bits_offset=0
2024-05-13T10:27:51.105034792+08:00 stdout F port type_id=7 bits_offset=16
2024-05-13T10:27:51.105036678+08:00 stdout F [7] TYPEDEF __u16 type_id=8
2024-05-13T10:27:51.105038611+08:00 stdout F [8] INT unsigned short size=2 bits_offset=0 nr_bits=16 encoding=(none)
2024-05-13T10:27:51.105040549+08:00 stdout F [9] PTR (anon) type_id=10
2024-05-13T10:27:51.105042368+08:00 stdout F [10] TYPEDEF __u32 type_id=11
2024-05-13T10:27:51.105044508+08:00 stdout F [11] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
2024-05-13T10:27:51.105056207+08:00 stdout F [12] PTR (anon) type_id=13
2024-05-13T10:27:51.105058304+08:00 stdout F [13] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=65535
2024-05-13T10:27:51.105062841+08:00 stdout F [14] STRUCT (anon) size=40 vlen=5
2024-05-13T10:27:51.105064947+08:00 stdout F type type_id=1 bits_offset=0
2024-05-13T10:27:51.105066787+08:00 stdout F key type_id=5 bits_offset=64
2024-05-13T10:27:51.10506869+08:00 stdout F value type_id=9 bits_offset=128
2024-05-13T10:27:51.105070562+08:00 stdout F max_entries type_id=12 bits_offset=192
2024-05-13T10:27:51.105072406+08:00 stdout F map_flags type_id=1 bits_offset=256
2024-05-13T10:27:51.105074477+08:00 stdout F [15] VAR calico_failsafe_ports type_id=14 linkage=1
2024-05-13T10:27:51.105076468+08:00 stdout F [16] PTR (anon) type_id=17
2024-05-13T10:27:51.105078329+08:00 stdout F [17] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=11
2024-05-13T10:27:51.10508018+08:00 stdout F [18] PTR (anon) type_id=19
2024-05-13T10:27:51.10508205+08:00 stdout F [19] UNION ip4_bpf_lpm_trie_key size=8 vlen=2
2024-05-13T10:27:51.105084247+08:00 stdout F lpm type_id=20 bits_offset=0
2024-05-13T10:27:51.105086183+08:00 stdout F ip type_id=24 bits_offset=0
2024-05-13T10:27:51.105088025+08:00 stdout F [20] STRUCT bpf_lpm_trie_key size=4 vlen=2
2024-05-13T10:27:51.105089881+08:00 stdout F prefixlen type_id=10 bits_offset=0
2024-05-13T10:27:51.105091823+08:00 stdout F data type_id=23 bits_offset=32
2024-05-13T10:27:51.105093644+08:00 stdout F [21] TYPEDEF __u8 type_id=22
2024-05-13T10:27:51.105095539+08:00 stdout F [22] INT unsigned char size=1 bits_offset=0 nr_bits=8 encoding=(none)
2024-05-13T10:27:51.105097464+08:00 stdout F [23] ARRAY (anon) type_id=21 index_type_id=4 nr_elems=0
2024-05-13T10:27:51.105099413+08:00 stdout F [24] STRUCT ip4key size=8 vlen=2
2024-05-13T10:27:51.105101317+08:00 stdout F mask type_id=10 bits_offset=0
2024-05-13T10:27:51.105103155+08:00 stdout F addr type_id=10 bits_offset=32
2024-05-13T10:27:51.105104951+08:00 stdout F [25] PTR (anon) type_id=26
2024-05-13T10:27:51.105108215+08:00 stdout F [26] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=10240
2024-05-13T10:27:51.105110118+08:00 stdout F [27] STRUCT (anon) size=40 vlen=5
2024-05-13T10:27:51.105112019+08:00 stdout F type type_id=16 bits_offset=0
2024-05-13T10:27:51.105113812+08:00 stdout F key type_id=18 bits_offset=64
2024-05-13T10:27:51.105115745+08:00 stdout F value type_id=9 bits_offset=128
2024-05-13T10:27:51.105117542+08:00 stdout F max_entries type_id=25 bits_offset=192
2024-05-13T10:27:51.105119336+08:00 stdout F map_flags type_id=1 bits_offset=256
2024-05-13T10:27:51.105121317+08:00 stdout F [28] VAR calico_prefilter_v4 type_id=27 linkage=1
2024-05-13T10:27:51.105123187+08:00 stdout F [29] PTR (anon) type_id=30
2024-05-13T10:27:51.105124976+08:00 stdout F [30] STRUCT xdp_md size=24 vlen=6
2024-05-13T10:27:51.10512682+08:00 stdout F data type_id=10 bits_offset=0
2024-05-13T10:27:51.105128732+08:00 stdout F data_end type_id=10 bits_offset=32
2024-05-13T10:27:51.105130611+08:00 stdout F data_meta type_id=10 bits_offset=64
2024-05-13T10:27:51.105132424+08:00 stdout F ingress_ifindex type_id=10 bits_offset=96
2024-05-13T10:27:51.105134199+08:00 stdout F rx_queue_index type_id=10 bits_offset=128
2024-05-13T10:27:51.105135986+08:00 stdout F egress_ifindex type_id=10 bits_offset=160
2024-05-13T10:27:51.105137862+08:00 stdout F [31] FUNC_PROTO (anon) return=32 args=(29 xdp)
2024-05-13T10:27:51.105139646+08:00 stdout F [32] ENUM xdp_action size=4 vlen=5
2024-05-13T10:27:51.105141569+08:00 stdout F XDP_ABORTED val=0
2024-05-13T10:27:51.105143442+08:00 stdout F XDP_DROP val=1
2024-05-13T10:27:51.10514533+08:00 stdout F XDP_PASS val=2
2024-05-13T10:27:51.105147193+08:00 stdout F XDP_TX val=3
2024-05-13T10:27:51.105149013+08:00 stdout F XDP_REDIRECT val=4
2024-05-13T10:27:51.105150841+08:00 stdout F [33] FUNC prefilter type_id=31 vlen != 0
2024-05-13T10:27:51.105152555+08:00 stdout F
2024-05-13T10:27:51.105154458+08:00 stdout F libbpf: Error loading .BTF into kernel: -22.
2024-05-13T10:27:51.10515903+08:00 stdout F Error: failed to open object file
2024-05-13T10:27:51.105160817+08:00 stdout F
OS: CentOS 7 (kernel 5.4), k8s: v1.29.4, Calico: v3.27.3
logs: calico-node.log
@codering to me it looks like you are dealing with another error, so it's better to open a new issue.
@quick691fr could you find the issue about SELinux?
I also encountered the same problem. The log file has grown to 143 GB and keeps repeating the following lines.
libbpf: Error loading .BTF into kernel: -22.
Error: failed to open object file
try=1
2024-05-26 10:15:17.684 [WARNING][100656] felix/int_dataplane.go 1822: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 936
str_off: 936
str_len: 1142
btf_total_size: 2102
[1] PTR (anon) type_id=3
[2] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[3] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=1
[4] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[5] PTR (anon) type_id=6
[6] STRUCT protoport size=4 vlen=2
proto type_id=7 bits_offset=0
port type_id=7 bits_offset=16
[7] TYPEDEF __u16 type_id=8
[8] INT unsigned short size=2 bits_offset=0 nr_bits=16 encoding=(none)
[9] PTR (anon) type_id=10
[10] TYPEDEF __u32 type_id=11
[11] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[12] PTR (anon) type_id=13
[13] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=65535
[14] STRUCT (anon) size=40 vlen=5
type type_id=1 bits_offset=0
key type_id=5 bits_offset=64
value type_id=9 bits_offset=128
max_entries type_id=12 bits_offset=192
map_flags type_id=1 bits_offset=256
[15] VAR calico_failsafe_ports type_id=14 linkage=1
[16] PTR (anon) type_id=17
[17] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=11
[18] PTR (anon) type_id=19
[19] UNION ip4_bpf_lpm_trie_key size=8 vlen=2
lpm type_id=20 bits_offset=0
ip type_id=24 bits_offset=0
[20] STRUCT bpf_lpm_trie_key size=4 vlen=2
prefixlen type_id=10 bits_offset=0
data type_id=23 bits_offset=32
[21] TYPEDEF __u8 type_id=22
[22] INT unsigned char size=1 bits_offset=0 nr_bits=8 encoding=(none)
[23] ARRAY (anon) type_id=21 index_type_id=4 nr_elems=0
[24] STRUCT ip4key size=8 vlen=2
mask type_id=10 bits_offset=0
addr type_id=10 bits_offset=32
[25] PTR (anon) type_id=26
[26] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=10240
[27] STRUCT (anon) size=40 vlen=5
type type_id=16 bits_offset=0
key type_id=18 bits_offset=64
value type_id=9 bits_offset=128
max_entries type_id=25 bits_offset=192
map_flags type_id=1 bits_offset=256
[28] VAR calico_prefilter_v4 type_id=27 linkage=1
[29] PTR (anon) type_id=30
[30] STRUCT xdp_md size=24 vlen=6
data type_id=10 bits_offset=0
data_end type_id=10 bits_offset=32
data_meta type_id=10 bits_offset=64
ingress_ifindex type_id=10 bits_offset=96
rx_queue_index type_id=10 bits_offset=128
egress_ifindex type_id=10 bits_offset=160
[31] FUNC_PROTO (anon) return=32 args=(29 xdp)
[32] ENUM xdp_action size=4 vlen=5
XDP_ABORTED val=0
XDP_DROP val=1
XDP_PASS val=2
XDP_TX val=3
XDP_REDIRECT val=4
[33] FUNC prefilter type_id=31 vlen != 0
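As a side note, the BTF header that libbpf dumps above is internally consistent (btf_total_size = hdr_len + type_len + str_len, and str_off = type_off + type_len), so the blob itself does not look corrupt; the EINVAL (-22) appears to come from the kernel rejecting the final `[33] FUNC ... vlen != 0` entry, since kernels older than roughly 5.6 reportedly do not accept a non-zero vlen on FUNC entries. A quick sanity check of the header arithmetic, with the values copied from the log:

```shell
# BTF header fields copied from the libbpf dump above
hdr_len=24
type_off=0
type_len=936
str_len=1142
btf_total_size=2102

# Total size should be header + type section + string section
computed=$((hdr_len + type_len + str_len))
echo "computed total: $computed (reported: $btf_total_size)"

# The string section should start right after the type section
echo "expected str_off: $((type_off + type_len)) (reported: 936)"
```

So the failure is about what the BTF encodes, not how it is laid out.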
K8s version (deployed using kubeadm):
Client Version: v1.29.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4
Calico version: v3.27.3, deployed using the manifest from https://docs.tigera.io/calico/3.27/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico-with-kubernetes-api-datastore-50-nodes-or-less
OS: Ubuntu, kernel: 5.4.0-100-generic
Hi! Getting the same error when installing on Ubuntu 20.04, with linux 5.4.0-110-generic. Had to downgrade from calico 3.28 down to 3.26.4 to get something working.
So, how do we fix this problem? I have the same issue.
My solution is to disable xdpEnabled
The specific operation is as follows:
sudo kubectl edit felixconfigurations default
apiVersion: crd.projectcalico.org/v1
kind: FelixConfiguration
metadata:
  annotations:
  generation: 2
  name: default
spec:
  bpfConnectTimeLoadBalancing: TCP
  bpfHostNetworkedNATWithoutCTLB: Enabled
  bpfLogLevel: ""
  floatingIPs: Disabled
  logSeverityScreen: Info
  reportingInterval: 0s
  # Add the following line
  xdpEnabled: false
# Delete the calico-node pods so they restart with the new config; delete them one at a time, not in batches
sudo kubectl delete pod calico-node-hkgc8 -n kube-system
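If you prefer not to open an interactive editor, the same change can be applied non-interactively with a merge patch. This is a sketch assuming the default FelixConfiguration exists in your cluster; verify the resource name before running:

```shell
# Merge-patch the default FelixConfiguration to disable XDP
kubectl patch felixconfiguration default --type=merge \
  -p '{"spec":{"xdpEnabled":false}}'

# Then restart the calico-node pods one at a time, e.g.:
# kubectl delete pod <calico-node-pod-name> -n kube-system
```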
See the reply above.
@huguesgr @limylily @limylily have you tried the latest 3.27 patch release?
We recently upgraded bpftool to v7.4 to fix this GH issue: https://github.com/projectcalico/calico/issues/8856
It seems similar to some of the ones discussed here. The fix will be available in the next 3.28 patch release which will be cut soon.
Hi @mazdakn, it seems fine with 3.27.4 indeed, thanks!