Calico reports errors when running on a 4.x or 5.x kernel

Open pangfaheng opened this issue 4 years ago • 16 comments

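Kubernetes node, Kubernetes version v1.20.6. Networking is deployed from calico.yaml and the nodes run CentOS 7. After upgrading the kernel to 4.x or 5.x, CPU usage rises to about 10% and some errors appear. Neither happens without the kernel upgrade.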

Node where the pod runs

[root@devops ~]# kubectl get pods calico-node-cvmhh -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-node-cvmhh 1/1 Running 0 29h 192.1.0.39 k8s-v1206-node-03

Kernel version of the node where the pod runs

[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 4.19.113-300.el7.x86_64 #1 SMP Mon Mar 30 21:50:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
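When checking many nodes, it may help to classify each one by kernel series automatically. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and the "4.x or 5.x" range is only what was observed on this cluster, not a confirmed rule.

```python
import re


def kernel_release(uname_output):
    """Return the kernel release from `uname -r` or `uname -a` output."""
    fields = uname_output.split()
    if not fields:
        raise ValueError("empty uname output")
    # `uname -a` prints: kernel name, hostname, kernel release, ...
    if fields[0] == "Linux" and len(fields) >= 3:
        return fields[2]
    # Otherwise assume the input is bare `uname -r` output.
    return fields[0]


def major_minor(release):
    """Parse '4.19.113-300.el7.x86_64' into (4, 19)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def in_reported_range(uname_output):
    """True for 4.x and 5.x kernels, the range named in this report."""
    major, _minor = major_minor(kernel_release(uname_output))
    return major in (4, 5)
```

Passing the `uname -a` line above to `in_reported_range` flags the node as running one of the kernel series on which the errors were observed.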

Pod logs

[root@devops ~]# kubectl logs calico-node-cvmhh -n kube-system --tail=100
2021-10-06 17:27:54.774 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:54.774 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:54.807 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=1
2021-10-06 17:27:54.895 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:54.896 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:54.902 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:54.902 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:54.938 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=2
2021-10-06 17:27:55.029 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.029 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.034 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.034 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:55.065 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=3
2021-10-06 17:27:55.157 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.157 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.160 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:55.160 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.187 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=4
2021-10-06 17:27:55.278 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.278 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.282 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:55.282 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.314 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=5
2021-10-06 17:27:55.403 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.403 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.408 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:55.408 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.436 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=6
2021-10-06 17:27:55.528 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.528 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.532 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.532 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
bird: Netlink: File exists
bird: Netlink: File exists
bird: ...
2021-10-06 17:27:55.561 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=7
2021-10-06 17:27:55.651 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.651 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.655 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.655 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:55.687 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=8
2021-10-06 17:27:55.778 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2021-10-06 17:27:55.779 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2021-10-06 17:27:55.783 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_B"
2021-10-06 17:27:55.783 [INFO][51] felix/int_dataplane.go 1060: Linux interface addrs changed. addrs= ifaceName="calico_tmp_A"
2021-10-06 17:27:55.812 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file try=9
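The log above repeats the same XDP cleanup failure roughly every 120 ms, with the `try=` counter climbing each time. To quantify that in a longer capture, a small parser can help. The following is a rough sketch, not a Calico tool: the function name `summarize` and both regular expressions are assumptions based only on the record layout visible in this excerpt (`timestamp [LEVEL][pid] source line: message`), and it accepts the log either one record per line or flattened onto a single line.

```python
import re
from collections import Counter

# Zero-width split point in front of every record header, e.g.
# "2021-10-06 17:27:54.807 [WARNING][51] ".
RECORD_START = re.compile(
    r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \[[A-Z]+\]\[\d+\] )"
)
# One record: timestamp, level, pid, source file, line number, message.
HEADER = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<level>[A-Z]+)\]\[\d+\] (?P<source>\S+) \d+: (?P<msg>.*)",
    re.DOTALL,
)


def summarize(log_text):
    """Count records per level and collect XDP wipe retry numbers."""
    levels = Counter()
    retries = []
    for record in RECORD_START.split(log_text):
        match = HEADER.match(record)
        if match is None:
            continue  # text before the first timestamp, e.g. bird output
        levels[match.group("level")] += 1
        message = match.group("msg")
        if "failed to wipe the XDP state" in message:
            attempt = re.search(r"try=(\d+)", message)
            if attempt:
                retries.append(int(attempt.group(1)))
    return levels, retries
```

Feeding it the saved output of `kubectl logs calico-node-cvmhh -n kube-system` returns the per-level counts and the list of retry numbers, which makes it easy to compare the node before and after a kernel change.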

Monitoring

NetworkManager is using about 7% CPU, which is abnormal:
PID CID SYSCPU USRCPU RDELAY VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/5
767 host-------- 2.92s 3.81s 7.00s 537.6M 13908K 9972K 12K root root N- - 3 S 0 7% NetworkManager
1 host-------- 2.41s 0.45s 2.37s 125.3M 8408K 271.4M 288.8M root root N- - 1 S 1 3% systemd
1029 host-------- 1.63s 1.10s 5.75s 1.5G 94276K 128.0M 124K root root N- - 15 S 0 3% kubelet
1110 host-------- 1.42s 0.98s 5.22s 1.2G 89056K 91520K 6316K root root N- - 18 S 0 2% dockerd
760 host-------- 0.54s 1.19s 2.39s 66636K 4588K 924K 0K dbus dbus N- - 2 S 0 2% dbus-daemon
628 host-------- 1.07s 0.21s 3.25s 49220K 7088K 15264K 116K root root N- - 1 S 1 1% systemd-udevd
601 host-------- 0.55s 0.19s 1.51s 39452K 7200K 1036K 0K root root N- - 1 S 0 1% systemd-journa
18128 2d43e3a75772 0.32s 0.39s 2.22s 1.4G 59468K 0K 28K root root N- - 11 S 0 1% calico-node
1038 host-------- 0.43s 0.23s 1.98s 1.0G 44608K 47252K 1964K root root N- - 9 S 0 1% containerd
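To compare process listings like this one across kernels, the rows can be turned into records. The following is a hedged sketch: it assumes the 18 whitespace-separated columns shown in the header above (the layout looks like atop output, which is an assumption), that no field contains a space, and that the header line and its trailing `1/5` page marker have been stripped. The names `parse_rows` and `top_cpu` are hypothetical.

```python
# Column names copied from the header of the listing in this report.
COLUMNS = (
    "PID CID SYSCPU USRCPU RDELAY VGROW RGROW RDDSK WRDSK "
    "RUID EUID ST EXC THR S CPUNR CPU CMD"
).split()


def parse_rows(text):
    """Split process rows (one per line or run together) into dicts."""
    tokens = text.split()
    width = len(COLUMNS)
    if len(tokens) % width != 0:
        raise ValueError(
            "expected a multiple of %d fields, got %d" % (width, len(tokens))
        )
    return [
        dict(zip(COLUMNS, tokens[start:start + width]))
        for start in range(0, len(tokens), width)
    ]


def top_cpu(text):
    """Return (command, cpu_percent) pairs, highest CPU share first."""
    pairs = [
        (row["CMD"], int(row["CPU"].rstrip("%")))
        for row in parse_rows(text)
    ]
    # sorted() is stable, so rows with equal CPU keep their listing order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

Running `top_cpu` on the rows captured under each kernel gives two short lists that can be compared directly, for example to check whether NetworkManager still appears near the top.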

After switching the kernel back to 3.x

[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Delete the old pod

[root@devops ~]# kubectl delete pods calico-node-cvmhh -n kube-system
pod "calico-node-cvmhh" deleted

New pod: the failures are gone

[root@devops ~]# kubectl get pods calico-node-qdq7c -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-node-qdq7c 1/1 Running 1 4m7s 192.1.0.39 k8s-v1206-node-03

[root@devops ~]# kubectl logs calico-node-qdq7c -n kube-system --tail=100
bird: Next hop address 20.0.0.0 resolvable through recursive route for 20.0.0.0/24
bird: Next hop address 20.0.1.0 resolvable through recursive route for 20.0.1.0/24
bird: Next hop address 20.0.3.0 resolvable through recursive route for 20.0.3.0/24
bird: Next hop address 20.0.2.0 resolvable through recursive route for 20.0.2.0/32
bird: Next hop address 20.0.0.0 resolvable through recursive route for 20.0.0.0/24
bird: Next hop address 20.0.1.0 resolvable through recursive route for 20.0.1.0/32
bird: Next hop address 20.0.3.0 resolvable through recursive route for 20.0.3.0/24
bird: Next hop address 20.0.2.0 resolvable through recursive route for 20.0.2.0/32
bird: Next hop address 20.0.0.0 resolvable through recursive route for 20.0.0.0/32
bird: Next hop address 20.0.1.0 resolvable through recursive route for 20.0.1.0/32
bird: Next hop address 20.0.3.0 resolvable through recursive route for 20.0.3.0/24
bird: Next hop address 20.0.2.0 resolvable through recursive route for 20.0.2.0/32
bird: Graceful restart done
bird: Mesh_192_1_0_34: State changed to feed
bird: Mesh_192_1_0_37: State changed to feed
bird: Mesh_192_1_0_38: State changed to feed
bird: Mesh_192_1_0_34: State changed to up
bird: Mesh_192_1_0_37: State changed to up
bird: Mesh_192_1_0_38: State changed to up
2021-10-06 17:37:26.234 [INFO][62] felix/health.go 196: Overall health status changed newStatus=&health.HealthReport{Live:true, Ready:true}
2021-10-06 17:37:29.973 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=kube-system/calico-kube-controllers-855445d444-bl2bn, name=eth0)
2021-10-06 17:37:29.973 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"kube-system/calico-kube-controllers-855445d444-bl2bn" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali092a3b8dbd6" profile_ids:"kns.kube-system" profile_ids:"ksa.kube-system.calico-kube-controllers" ipv4_nets:"20.159.149.77/32" >
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.974 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:29.974 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali092a3b8dbd6"
2021-10-06 17:37:29.974 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.974 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:29.985 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:37:29.985 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} status="up"
2021-10-06 17:37:39.287 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=kube-system/calico-kube-controllers-855445d444-bl2bn, name=eth0)
2021-10-06 17:37:39.287 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"kube-system/calico-kube-controllers-855445d444-bl2bn" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali092a3b8dbd6" profile_ids:"kns.kube-system" profile_ids:"ksa.kube-system.calico-kube-controllers" ipv4_nets:"20.159.149.77/32" >
2021-10-06 17:37:39.287 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:39.287 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:39.287 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:39.287 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:39.287 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali092a3b8dbd6"
2021-10-06 17:37:39.287 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:39.287 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:39.293 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:37:39.293 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} status="up"
2021-10-06 17:37:51.113 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8, name=eth0)
2021-10-06 17:37:51.113 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali8728f271cce" profile_ids:"kns.nfs" profile_ids:"ksa.nfs.nfs-client-nfs-client-provisioner" ipv4_nets:"20.159.149.78/32" >
2021-10-06 17:37:51.113 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:37:51.113 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali8728f271cce" ipVersion=0x4 table="filter"
2021-10-06 17:37:51.113 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali8728f271cce" ipVersion=0x4 table="filter"
2021-10-06 17:37:51.113 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:37:51.113 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali8728f271cce"
2021-10-06 17:37:51.113 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:37:51.113 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:37:51.120 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:37:51.120 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"} status="up"
2021-10-06 17:37:52.162 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=kube-system/calico-kube-controllers-855445d444-bl2bn, name=eth0)
2021-10-06 17:37:52.162 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"kube-system/calico-kube-controllers-855445d444-bl2bn" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali092a3b8dbd6" profile_ids:"kns.kube-system" profile_ids:"ksa.kube-system.calico-kube-controllers" ipv4_nets:"20.159.149.77/32" >
2021-10-06 17:37:52.162 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:52.162 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:52.162 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:52.162 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:52.162 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali092a3b8dbd6"
2021-10-06 17:37:52.162 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:52.162 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:52.172 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:37:52.172 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} status="up"
2021-10-06 17:37:58.961 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=kube-system/calico-kube-controllers-855445d444-bl2bn, name=eth0)
2021-10-06 17:37:58.961 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"kube-system/calico-kube-controllers-855445d444-bl2bn" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali092a3b8dbd6" profile_ids:"kns.kube-system" profile_ids:"ksa.kube-system.calico-kube-controllers" ipv4_nets:"20.159.149.77/32" >
2021-10-06 17:37:58.961 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:58.961 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:58.961 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali092a3b8dbd6" ipVersion=0x4 table="filter"
2021-10-06 17:37:58.961 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:58.961 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali092a3b8dbd6"
2021-10-06 17:37:58.961 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:58.961 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"}
2021-10-06 17:37:58.967 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:37:58.967 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"kube-system/calico-kube-controllers-855445d444-bl2bn", EndpointId:"eth0"} status="up"
2021-10-06 17:38:04.344 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8, name=eth0)
2021-10-06 17:38:04.345 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali8728f271cce" profile_ids:"kns.nfs" profile_ids:"ksa.nfs.nfs-client-nfs-client-provisioner" ipv4_nets:"20.159.149.78/32" >
2021-10-06 17:38:04.345 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:04.345 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali8728f271cce" ipVersion=0x4 table="filter"
2021-10-06 17:38:04.345 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali8728f271cce" ipVersion=0x4 table="filter"
2021-10-06 17:38:04.346 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:04.346 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali8728f271cce"
2021-10-06 17:38:04.346 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:04.346 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:04.361 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:38:04.361 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"} status="up"
2021-10-06 17:38:05.252 [INFO][62] felix/calc_graph.go 445: Local endpoint updated id=WorkloadEndpoint(node=k8s-v1206-node-03, orchestrator=k8s, workload=nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8, name=eth0)
2021-10-06 17:38:05.252 [INFO][62] felix/int_dataplane.go 1484: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"k8s" workload_id:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali8728f271cce" profile_ids:"kns.nfs" profile_ids:"ksa.nfs.nfs-client-nfs-client-provisioner" ipv4_nets:"20.159.149.78/32" >
2021-10-06 17:38:05.252 [INFO][62] felix/endpoint_mgr.go 583: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:05.253 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-tw-cali8728f271cce" ipVersion=0x4 table="filter"
2021-10-06 17:38:05.253 [INFO][62] felix/table.go 506: Queueing update of chain. chainName="cali-fw-cali8728f271cce" ipVersion=0x4 table="filter"
2021-10-06 17:38:05.253 [INFO][62] felix/endpoint_mgr.go 614: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:05.253 [INFO][62] felix/endpoint_mgr.go 1145: Applying /proc/sys configuration to interface. ifaceName="cali8728f271cce"
2021-10-06 17:38:05.254 [INFO][62] felix/endpoint_mgr.go 476: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="up" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:05.254 [INFO][62] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"}
2021-10-06 17:38:05.259 [INFO][62] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"} ipVersion=0x4 status="up"
2021-10-06 17:38:05.259 [INFO][62] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"nfs/nfs-client-nfs-client-provisioner-5c5bc74c5d-kgxx8", EndpointId:"eth0"} status="up"
2021-10-06 17:38:21.647 [INFO][50] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface ens33: 192.1.0.39/16
2021-10-06 17:38:24.756 [INFO][62] felix/summary.go 100: Summarising 41 dataplane reconciliation loops over 1m3.1s: avg=11ms longest=240ms (resync-filter-v4,resync-ipsets-v4,resync-mangle-v4,resync-nat-v4,resync-raw-v4,resync-routes-v4,resync-routes-v4,resync-rules-v4,update-filter-v4,update-ipsets-4,update-mangle-v4,update-nat-v4,update-raw-v4)
2021-10-06 17:39:21.649 [INFO][50] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface ens33: 192.1.0.39/16

Monitoring

9868 dfc6a1a4eb0b 7.14s 2.12s 3.96s 1.6G 46596K 10620K 4K root root N- - 13 S 1 2% calico-node
960 host-------- 5.82s 2.93s 9.78s 1.3G 79780K 167.6M 128K root root N- - 16 S 0 2% kubelet
1091 host-------- 2.48s 1.96s 8.79s 1.1G 78124K 88728K 2660K root root N- - 11 S 0 1% dockerd
965 host-------- 1.80s 0.83s 5.03s 1.0G 42212K 47420K 1684K root root N- - 9 S 0 1% containerd
1 host-------- 1.63s 0.33s 0.36s 125.4M 7064K 326.9M 162.6M root root N- - 1 S 1 0% systemd
536 host-------- 0.22s 0.79s 0.49s 50112K 6632K 15044K 0K root root N- - 1 S 0 0% systemd-udevd
10652 54cf7a7b7c43 0.91s 0.07s 1.10s 30900K 13144K 0K 0K root root N- - 7 S 1 0% nfs-client-pro
9 host-------- 0.66s 0.00s 5.02s 0K 0K 0K 0K root root N- - 1 S 1 0% rcu_sched
9712 host-------- 0.33s 0.11s 1.35s 696.4M 13720K 4K 0K root root N- - 11 S 0 0% containerd-shi
1006 host-------- 0.43s 0.00s 0.53s 123.4M 1692K 172K 36K root root N- - 1 S 0 0% crond
10395 host-------- 0.32s 0.09s 0.80s 696.4M 11992K 0K 0K root root N- - 11 S 0 0% containerd-shi
10414 30ad04d4a575 0.21s 0.11s 0.18s 729.8M 32068K 0K 16K polkitd polkitd N- - 5 S 1 0% kube-controlle
499 host-------- 0.27s 0.05s 0.36s 39448K 3720K 1036K 0K root root N- - 1 S 1 0% systemd-journa

pangfaheng avatar Oct 06 '21 17:10 pangfaheng

@sridhartigera Could you help take a look? Felix hits a "failed to wipe the XDP state" error after the kernel upgrade.

song-jiang avatar Nov 02 '21 16:11 song-jiang

@mazdakn @neiljerram Looks like some xdp failures. Can you PTAL?

sridhartigera avatar Nov 09 '21 22:11 sridhartigera

It is a CentOS 7 system. The error occurred after upgrading the kernel to 4.x or 5.x, and CPU usage went up to about 10%.

Kernel version before the upgrade:

[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Kernel version after the upgrade:

[root@k8s-v1206-node-03 ~]# uname -a
Linux k8s-v1206-node-03 4.19.113-300.el7.x86_64 #1 SMP Mon Mar 30 21:50:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

song-jiang avatar Nov 10 '21 11:11 song-jiang

@sridhartigera Please note:

2021-10-06 17:27:55.561 [WARNING][51] felix/int_dataplane.go 1431: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory

It looks like the reporter has an interface named calico_tmp_A. Didn't you recently see a problem where it was necessary to remove certain special characters when converting from the interface name to the corresponding file name in the bpffs?

@PayneRose You may like to try again with interface names without special characters. For example "calicotmpA" and "calicotmpB" instead of "calico_tmp_A" and "calico_tmp_B".

nelljerram avatar Nov 10 '21 13:11 nelljerram

@PayneRose ping - any update on this?

caseydavenport avatar Dec 07 '21 22:12 caseydavenport

Has anyone found the problem? It seems related to SELinux from what I see in audit.log.

quick691fr avatar Mar 15 '22 11:03 quick691fr

@quick691fr Please do feel free to explain more!

nelljerram avatar Mar 17 '22 10:03 nelljerram

@neiljerram

I have calico installed in a k8s cluster on CentOS nodes deployed from my Rancher. Here is part of the continuous logs (one entry every 130ms) from the k8s_calico-node_canal-42f4n_kube-system_fe2012fc-ab67-4bc0-a083-3ff6cb3df3e9_0 container:

[root@wrkr-1 centos]# docker logs `docker ps | grep calico | cut -d ' ' -f 1`
2022-03-21 14:43:53.321 [WARNING][40] felix/int_dataplane.go 1394: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file
 try=3
2022-03-21 14:43:53.454 [WARNING][40] felix/int_dataplane.go 1394: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error in bpf_object__probe_name():Permission denied(13). Couldn't load basic 'r0 = 0' BPF program.
libbpf: Error in bpf_object__probe_global_data():Permission denied(13). Couldn't create simple array map.
libbpf: failed to create map (name: 'calico_prefilter_v4'): Permission denied(-13)
libbpf: failed to load object '/usr/lib/calico/bpf/filter.o'
Error: failed to load object file
 try=4

Since it's a permission denied error, I checked the audit.log related to SELinux. Here are the continuous logs (also every 130ms):

[root@wrkr-1 centos]# tail -f /var/log/audit/audit.log
type=AVC msg=audit(1647949699.710:2927154): avc:  denied  { map_create } for  pid=29339 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1647949699.730:2927155): avc:  denied  { prog_load } for  pid=29350 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1647949699.730:2927156): avc:  denied  { map_create } for  pid=29350 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0
type=AVC msg=audit(1647949699.730:2927157): avc:  denied  { map_create } for  pid=29350 comm="bpftool" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=bpf permissive=0

Seeing these denied errors for bpftool in audit.log, I tried to change the SELinux context of the /sys/fs/bpf/calico/xdp/ path, but it seems to be a virtual path and I'm not able to change its context.

The problem is that my calico container keeps filling the log file and then the filesystem; after a few days, my node hits a "no more space available" error and stops working.
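
For anyone checking whether they are hitting the same SELinux denials, here is a minimal shell sketch that counts AVC denials for the bpf class per command and permission. It assumes the audit line format shown in the excerpt above; the function name summarize_bpf_denials is made up for illustration, and it only summarizes the log, it does not change any SELinux policy.

```shell
#!/bin/sh
# Count SELinux AVC denials for tclass=bpf, grouped by (command, permission).
# Reads audit log lines on stdin; assumes the "avc: denied { perm } ... comm="x""
# layout seen in the audit.log excerpt in this thread.
summarize_bpf_denials() {
    grep 'tclass=bpf' \
        | sed -n 's/.*avc:[[:space:]]*denied[[:space:]]*{ \([a-z_ ]*\) }.*comm="\([^"]*\)".*/\2 \1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ $1 = $1; print }'   # normalize the padding added by uniq -c
}

# Example (reading the audit log usually requires root):
# summarize_bpf_denials < /var/log/audit/audit.log
```

Each output line has the form "count command permission", so a flood of bpftool map_create and prog_load denials like the one above shows up as one or two lines.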

quick691fr avatar Mar 22 '22 11:03 quick691fr

@neiljerram ping

quick691fr avatar Apr 01 '22 14:04 quick691fr

@sridhartigera ping ?

quick691fr avatar Apr 11 '22 14:04 quick691fr

@quick691fr I'm afraid I don't know. Please can you ask for help in the SELinux community?

nelljerram avatar Apr 11 '22 14:04 nelljerram

same error.

2024-05-13T10:27:51.097729825+08:00 stdout F 2024-05-13 02:27:51.097 [INFO][69] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=<nil> ifaceName="calico_tmp_A"
2024-05-13T10:27:51.097732276+08:00 stdout F 2024-05-13 02:27:51.097 [INFO][69] felix/int_dataplane.go 1387: Linux interface state changed. ifIndex=4052912 ifaceName="calico_tmp_B" state=""
2024-05-13T10:27:51.097734349+08:00 stdout F 2024-05-13 02:27:51.097 [INFO][69] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=<nil> ifaceName="calico_tmp_B"
2024-05-13T10:27:51.104990159+08:00 stdout F 2024-05-13 02:27:51.103 [INFO][69] felix/int_dataplane.go 2154: Applying XDP actions did not succeed, disabling XDP error=failed to resync: failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
2024-05-13T10:27:51.105000003+08:00 stdout F libbpf: Error loading BTF: Invalid argument(22)
2024-05-13T10:27:51.105002529+08:00 stdout F libbpf: magic: 0xeb9f
2024-05-13T10:27:51.105004766+08:00 stdout F version: 1
2024-05-13T10:27:51.10500674+08:00 stdout F flags: 0x0
2024-05-13T10:27:51.105008777+08:00 stdout F hdr_len: 24
2024-05-13T10:27:51.105010677+08:00 stdout F type_off: 0
2024-05-13T10:27:51.105012721+08:00 stdout F type_len: 936
2024-05-13T10:27:51.105014707+08:00 stdout F str_off: 936
2024-05-13T10:27:51.105016551+08:00 stdout F str_len: 1142
2024-05-13T10:27:51.105018656+08:00 stdout F btf_total_size: 2102
2024-05-13T10:27:51.105020527+08:00 stdout F [1] PTR (anon) type_id=3
2024-05-13T10:27:51.105022656+08:00 stdout F [2] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
2024-05-13T10:27:51.105024618+08:00 stdout F [3] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=1
2024-05-13T10:27:51.105026643+08:00 stdout F [4] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
2024-05-13T10:27:51.105028545+08:00 stdout F [5] PTR (anon) type_id=6
2024-05-13T10:27:51.105030443+08:00 stdout F [6] STRUCT protoport size=4 vlen=2
2024-05-13T10:27:51.105032886+08:00 stdout F 	proto type_id=7 bits_offset=0
2024-05-13T10:27:51.105034792+08:00 stdout F 	port type_id=7 bits_offset=16
2024-05-13T10:27:51.105036678+08:00 stdout F [7] TYPEDEF __u16 type_id=8
2024-05-13T10:27:51.105038611+08:00 stdout F [8] INT unsigned short size=2 bits_offset=0 nr_bits=16 encoding=(none)
2024-05-13T10:27:51.105040549+08:00 stdout F [9] PTR (anon) type_id=10
2024-05-13T10:27:51.105042368+08:00 stdout F [10] TYPEDEF __u32 type_id=11
2024-05-13T10:27:51.105044508+08:00 stdout F [11] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
2024-05-13T10:27:51.105056207+08:00 stdout F [12] PTR (anon) type_id=13
2024-05-13T10:27:51.105058304+08:00 stdout F [13] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=65535
2024-05-13T10:27:51.105062841+08:00 stdout F [14] STRUCT (anon) size=40 vlen=5
2024-05-13T10:27:51.105064947+08:00 stdout F 	type type_id=1 bits_offset=0
2024-05-13T10:27:51.105066787+08:00 stdout F 	key type_id=5 bits_offset=64
2024-05-13T10:27:51.10506869+08:00 stdout F 	value type_id=9 bits_offset=128
2024-05-13T10:27:51.105070562+08:00 stdout F 	max_entries type_id=12 bits_offset=192
2024-05-13T10:27:51.105072406+08:00 stdout F 	map_flags type_id=1 bits_offset=256
2024-05-13T10:27:51.105074477+08:00 stdout F [15] VAR calico_failsafe_ports type_id=14 linkage=1
2024-05-13T10:27:51.105076468+08:00 stdout F [16] PTR (anon) type_id=17
2024-05-13T10:27:51.105078329+08:00 stdout F [17] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=11
2024-05-13T10:27:51.10508018+08:00 stdout F [18] PTR (anon) type_id=19
2024-05-13T10:27:51.10508205+08:00 stdout F [19] UNION ip4_bpf_lpm_trie_key size=8 vlen=2
2024-05-13T10:27:51.105084247+08:00 stdout F 	lpm type_id=20 bits_offset=0
2024-05-13T10:27:51.105086183+08:00 stdout F 	ip type_id=24 bits_offset=0
2024-05-13T10:27:51.105088025+08:00 stdout F [20] STRUCT bpf_lpm_trie_key size=4 vlen=2
2024-05-13T10:27:51.105089881+08:00 stdout F 	prefixlen type_id=10 bits_offset=0
2024-05-13T10:27:51.105091823+08:00 stdout F 	data type_id=23 bits_offset=32
2024-05-13T10:27:51.105093644+08:00 stdout F [21] TYPEDEF __u8 type_id=22
2024-05-13T10:27:51.105095539+08:00 stdout F [22] INT unsigned char size=1 bits_offset=0 nr_bits=8 encoding=(none)
2024-05-13T10:27:51.105097464+08:00 stdout F [23] ARRAY (anon) type_id=21 index_type_id=4 nr_elems=0
2024-05-13T10:27:51.105099413+08:00 stdout F [24] STRUCT ip4key size=8 vlen=2
2024-05-13T10:27:51.105101317+08:00 stdout F 	mask type_id=10 bits_offset=0
2024-05-13T10:27:51.105103155+08:00 stdout F 	addr type_id=10 bits_offset=32
2024-05-13T10:27:51.105104951+08:00 stdout F [25] PTR (anon) type_id=26
2024-05-13T10:27:51.105108215+08:00 stdout F [26] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=10240
2024-05-13T10:27:51.105110118+08:00 stdout F [27] STRUCT (anon) size=40 vlen=5
2024-05-13T10:27:51.105112019+08:00 stdout F 	type type_id=16 bits_offset=0
2024-05-13T10:27:51.105113812+08:00 stdout F 	key type_id=18 bits_offset=64
2024-05-13T10:27:51.105115745+08:00 stdout F 	value type_id=9 bits_offset=128
2024-05-13T10:27:51.105117542+08:00 stdout F 	max_entries type_id=25 bits_offset=192
2024-05-13T10:27:51.105119336+08:00 stdout F 	map_flags type_id=1 bits_offset=256
2024-05-13T10:27:51.105121317+08:00 stdout F [28] VAR calico_prefilter_v4 type_id=27 linkage=1
2024-05-13T10:27:51.105123187+08:00 stdout F [29] PTR (anon) type_id=30
2024-05-13T10:27:51.105124976+08:00 stdout F [30] STRUCT xdp_md size=24 vlen=6
2024-05-13T10:27:51.10512682+08:00 stdout F 	data type_id=10 bits_offset=0
2024-05-13T10:27:51.105128732+08:00 stdout F 	data_end type_id=10 bits_offset=32
2024-05-13T10:27:51.105130611+08:00 stdout F 	data_meta type_id=10 bits_offset=64
2024-05-13T10:27:51.105132424+08:00 stdout F 	ingress_ifindex type_id=10 bits_offset=96
2024-05-13T10:27:51.105134199+08:00 stdout F 	rx_queue_index type_id=10 bits_offset=128
2024-05-13T10:27:51.105135986+08:00 stdout F 	egress_ifindex type_id=10 bits_offset=160
2024-05-13T10:27:51.105137862+08:00 stdout F [31] FUNC_PROTO (anon) return=32 args=(29 xdp)
2024-05-13T10:27:51.105139646+08:00 stdout F [32] ENUM xdp_action size=4 vlen=5
2024-05-13T10:27:51.105141569+08:00 stdout F 	XDP_ABORTED val=0
2024-05-13T10:27:51.105143442+08:00 stdout F 	XDP_DROP val=1
2024-05-13T10:27:51.10514533+08:00 stdout F 	XDP_PASS val=2
2024-05-13T10:27:51.105147193+08:00 stdout F 	XDP_TX val=3
2024-05-13T10:27:51.105149013+08:00 stdout F 	XDP_REDIRECT val=4
2024-05-13T10:27:51.105150841+08:00 stdout F [33] FUNC prefilter type_id=31 vlen != 0
2024-05-13T10:27:51.105152555+08:00 stdout F 
2024-05-13T10:27:51.105154458+08:00 stdout F libbpf: Error loading .BTF into kernel: -22.
2024-05-13T10:27:51.10515903+08:00 stdout F Error: failed to open object file
2024-05-13T10:27:51.105160817+08:00 stdout F 

OS: CentOS 7 (kernel 5.4), k8s: 1.29.4, calico: 3.27.3


logs: calico-node.log

codering avatar May 13 '24 03:05 codering

@codering to me it looks like you are dealing with another error, so it's better to open a new issue.

mazdakn avatar May 13 '24 17:05 mazdakn

@quick691fr could you find the issue about SELinux?

mazdakn avatar May 13 '24 17:05 mazdakn

I also encountered the same problem. The log file has grown to 143G and keeps repeating the following logs.

libbpf: Error loading .BTF into kernel: -22.
Error: failed to open object file
 try=1
2024-05-26 10:15:17.684 [WARNING][100656] felix/int_dataplane.go 1822: failed to wipe the XDP state error=failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 936
str_off: 936
str_len: 1142
btf_total_size: 2102
[1] PTR (anon) type_id=3
[2] INT int size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[3] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=1
[4] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
[5] PTR (anon) type_id=6
[6] STRUCT protoport size=4 vlen=2
        proto type_id=7 bits_offset=0
        port type_id=7 bits_offset=16
[7] TYPEDEF __u16 type_id=8
[8] INT unsigned short size=2 bits_offset=0 nr_bits=16 encoding=(none)
[9] PTR (anon) type_id=10
[10] TYPEDEF __u32 type_id=11
[11] INT unsigned int size=4 bits_offset=0 nr_bits=32 encoding=(none)
[12] PTR (anon) type_id=13
[13] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=65535
[14] STRUCT (anon) size=40 vlen=5
        type type_id=1 bits_offset=0
        key type_id=5 bits_offset=64
        value type_id=9 bits_offset=128
        max_entries type_id=12 bits_offset=192
        map_flags type_id=1 bits_offset=256
[15] VAR calico_failsafe_ports type_id=14 linkage=1
[16] PTR (anon) type_id=17
[17] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=11
[18] PTR (anon) type_id=19
[19] UNION ip4_bpf_lpm_trie_key size=8 vlen=2
        lpm type_id=20 bits_offset=0
        ip type_id=24 bits_offset=0
[20] STRUCT bpf_lpm_trie_key size=4 vlen=2
        prefixlen type_id=10 bits_offset=0
        data type_id=23 bits_offset=32
[21] TYPEDEF __u8 type_id=22
[22] INT unsigned char size=1 bits_offset=0 nr_bits=8 encoding=(none)
[23] ARRAY (anon) type_id=21 index_type_id=4 nr_elems=0
[24] STRUCT ip4key size=8 vlen=2
        mask type_id=10 bits_offset=0
        addr type_id=10 bits_offset=32
[25] PTR (anon) type_id=26
[26] ARRAY (anon) type_id=2 index_type_id=4 nr_elems=10240
[27] STRUCT (anon) size=40 vlen=5
        type type_id=16 bits_offset=0
        key type_id=18 bits_offset=64
        value type_id=9 bits_offset=128
        max_entries type_id=25 bits_offset=192
        map_flags type_id=1 bits_offset=256
[28] VAR calico_prefilter_v4 type_id=27 linkage=1
[29] PTR (anon) type_id=30
[30] STRUCT xdp_md size=24 vlen=6
        data type_id=10 bits_offset=0
        data_end type_id=10 bits_offset=32
        data_meta type_id=10 bits_offset=64
        ingress_ifindex type_id=10 bits_offset=96
        rx_queue_index type_id=10 bits_offset=128
        egress_ifindex type_id=10 bits_offset=160
[31] FUNC_PROTO (anon) return=32 args=(29 xdp)
[32] ENUM xdp_action size=4 vlen=5
        XDP_ABORTED val=0
        XDP_DROP val=1
        XDP_PASS val=2
        XDP_TX val=3
        XDP_REDIRECT val=4
[33] FUNC prefilter type_id=31 vlen != 0

K8s version (deployed using kubeadm):

Client Version: v1.29.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4

Calico version: v3.27.3, deployed using the YAML manifest: https://docs.tigera.io/calico/3.27/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico-with-kubernetes-api-datastore-50-nodes-or-less

OS: Ubuntu, kernel: 5.4.0-100-generic

limylily avatar Jun 19 '24 07:06 limylily

Hi! Getting the same error when installing on Ubuntu 20.04, with linux 5.4.0-110-generic. Had to downgrade from calico 3.28 down to 3.26.4 to get something working.

huguesgr avatar Jun 21 '24 09:06 huguesgr

So how do I fix this problem? I have the same problem.

yinfuqian avatar Jul 11 '24 08:07 yinfuqian

My solution is to set xdpEnabled to false in the FelixConfiguration.

The specific steps are as follows:

sudo kubectl edit felixconfigurations default

apiVersion: crd.projectcalico.org/v1
kind: FelixConfiguration
metadata:
  annotations:
  generation: 2
  name: default
spec:
  bpfConnectTimeLoadBalancing: TCP
  bpfHostNetworkedNATWithoutCTLB: Enabled
  bpfLogLevel: ""
  floatingIPs: Disabled
  logSeverityScreen: Info
  reportingInterval: 0s
  # Add the following content
  xdpEnabled: false

# Delete the calico-node pods; do not delete them all at once, delete them in batches
sudo kubectl delete pod calico-node-hkgc8 -n kube-system
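
The same change can probably be applied without an interactive edit. The following is a sketch under assumptions, not an official procedure: it assumes the calico-node pods carry the k8s-app=calico-node label and belong to a DaemonSet named calico-node in kube-system (as in the stock calico.yaml manifest). The function names and the KUBECTL variable are hypothetical helpers introduced here.

```shell
#!/bin/sh
# KUBECTL can be overridden, e.g. KUBECTL="sudo kubectl", or
# KUBECTL="echo kubectl" to preview the patch command without running it.
KUBECTL="${KUBECTL:-kubectl}"

# Set spec.xdpEnabled=false on the default FelixConfiguration via a merge patch.
disable_xdp() {
    $KUBECTL patch felixconfigurations default \
        --type merge -p '{"spec":{"xdpEnabled":false}}'
}

# Delete calico-node pods one at a time. After each deletion, wait for the
# DaemonSet to report all pods available again before moving on.
# Assumption: label k8s-app=calico-node and DaemonSet name calico-node.
restart_calico_nodes() {
    for pod in $($KUBECTL get pods -n kube-system -l k8s-app=calico-node -o name); do
        $KUBECTL delete -n kube-system "$pod"
        $KUBECTL rollout status daemonset/calico-node -n kube-system --timeout=120s
    done
}

# Usage:
# disable_xdp && restart_calico_nodes
```

Note that kubectl rollout status may return before the replacement pod has been counted by the DaemonSet controller, so the wait between deletions should be treated as best effort.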

limylily avatar Jul 11 '24 12:07 limylily

So how do I fix this problem? I have the same problem.

See my reply with the solution in this thread.

limylily avatar Jul 11 '24 12:07 limylily

@huguesgr @limylily have you tried the latest 3.27 patch release? We recently upgraded bpftool to v7.4 to fix this GH issue: https://github.com/projectcalico/calico/issues/8856

It seems similar to some of the errors discussed here. The fix will be available in the next 3.28 patch release, which will be cut soon.

mazdakn avatar Jul 11 '24 15:07 mazdakn

Hi @mazdakn, it seems fine with 3.27.4 indeed, thanks!
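
To check whether a cluster on the 3.27 line is already at the patch release reported as working here, something like the following sketch may help. It assumes GNU sort (for -V), a tag-style image reference such as calico/node:v3.27.3 rather than a digest, and a DaemonSet named calico-node in kube-system; the helper names are made up for illustration. The comparison is only meaningful within the 3.27.x line, since the thread says 3.28 needs its own patch release.

```shell
#!/bin/sh
# Succeed if version $1 >= version $2; a leading "v" is ignored.
# Relies on GNU sort -V for version ordering.
version_ge() {
    a="${1#v}"; b="${2#v}"
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# Extract the tag from an image reference such as docker.io/calico/node:v3.27.3.
image_tag() {
    printf '%s\n' "${1##*:}"
}

# Print the image of the first container in the calico-node DaemonSet
# (name and namespace are assumptions based on the pod names in this thread).
calico_node_image() {
    kubectl get daemonset calico-node -n kube-system \
        -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# Usage:
# tag=$(image_tag "$(calico_node_image)")
# if version_ge "$tag" 3.27.4; then echo "at or past 3.27.4"; else echo "older than 3.27.4"; fi
```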

huguesgr avatar Aug 01 '24 05:08 huguesgr