
Kube-router crashes on fresh node when workload gets scheduled

rkojedzinszky opened this issue on Feb 24 '22

What happened? Kube-router crashes when a workload is scheduled onto a freshly added node.

What did you expect to happen? Normal operation

How can we reproduce the behavior you experienced? Steps to reproduce the behavior:

  1. Add a freshly joined node to Kubernetes, with kube-router and kube-proxy running
  2. Schedule a workload to that node (e.g. coredns); see the sketch below
  3. Kube-router crashes
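
A rough sketch of how I trigger it, as referenced in step 2 (the node name, labels, and drain flags below are just examples from my 3-node setup):

# Assumptions: one control-plane node "master-1", two workers, coredns as the
# workload, and kube-router deployed with the label k8s-app=kube-router.

# Force coredns off the control plane so it lands on the workers:
kubectl drain master-1 --ignore-daemonsets --delete-emptydir-data

# Watch where the coredns pods are rescheduled:
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide -w

# Watch the kube-router pods for restarts on the worker that received coredns:
kubectl -n kube-system get pods -l k8s-app=kube-router -o wide -w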

**System Information (please complete the following information):**

  • Kube-Router Version (kube-router --version):
Running kube-router version v1.4.0, built on 2022-01-05T17:01:42+0000, go1.17.5
  • Kube-Router Parameters:
        - --run-router=true
        - --run-firewall=true
        - --run-service-proxy=false
        - --bgp-graceful-restart=true
  • Kubernetes Version (kubectl version):
# kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.7", GitCommit:"b56e432f2191419647a6a13b9f5867801850f969", GitTreeState:"clean", BuildDate:"2022-02-16T11:50:27Z", GoVersion:"go1.16.14", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.7", GitCommit:"b56e432f2191419647a6a13b9f5867801850f969", GitTreeState:"clean", BuildDate:"2022-02-16T11:43:55Z", GoVersion:"go1.16.14", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud Type: bare metal
  • Kubernetes Deployment Type: kubeadm
  • Kube-Router Deployment Type: DaemonSet
  • Cluster Size: 3 nodes

**Logs, other output, metrics:** See https://asciinema.org/a/471319

Got logs from kube-router:

E0224 09:45:49.163103       1 network_policy_controller.go:276] Aborting sync. Failed to run iptables-restore: exit status 4 (iptables-restore v1.8.7 (nf_tables): 
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-EXTERNAL-SERVICES
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-FIREWALL
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-FORWARD
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-KUBELET-CANARY
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-NODEPORTS
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-NWPLCY-DEFAULT
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-PROXY-CANARY
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-ROUTER-FORWARD
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-ROUTER-INPUT
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-ROUTER-OUTPUT
line 60: CHAIN_USER_ADD failed (File exists): chain KUBE-SERVICES
)
*filter
:INPUT ACCEPT [58:48907] - [0:0]
:FORWARD ACCEPT [0:0] - [0:0]
:OUTPUT ACCEPT [58:6455] - [0:0]
:KUBE-EXTERNAL-SERVICES - [0:0] - [0:0]
:KUBE-FIREWALL - [0:0] - [0:0]
:KUBE-FORWARD - [0:0] - [0:0]
:KUBE-KUBELET-CANARY - [0:0] - [0:0]
:KUBE-NODEPORTS - [0:0] - [0:0]
:KUBE-NWPLCY-DEFAULT - [0:0] - [0:0]
:KUBE-PROXY-CANARY - [0:0] - [0:0]
:KUBE-ROUTER-FORWARD - [0:0] - [0:0]
:KUBE-ROUTER-INPUT - [0:0] - [0:0]
:KUBE-ROUTER-OUTPUT - [0:0] - [0:0]
:KUBE-SERVICES - [0:0] - [0:0]
:KUBE-POD-FW-5BZDTUBTJ6Y46WZY - [0:0]
-A INPUT -m comment --comment "kube-router netpol - 4IA2OSFRMVNDXBVV" -j KUBE-ROUTER-INPUT
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kube-router netpol - TEMCG2JMHZYE7H7T" -j KUBE-ROUTER-FORWARD
-A FORWARD -o ens3 -m comment --comment "allow outbound node port traffic on node interface with which node ip is associated" -j ACCEPT
-A FORWARD -o kube-bridge -m comment --comment "allow inbound traffic to pods" -j ACCEPT
-A FORWARD -i kube-bridge -m comment --comment "allow outbound traffic from pods" -j ACCEPT
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A OUTPUT -m comment --comment "kube-router netpol - VEAAIY32XVBHCSCY" -j KUBE-ROUTER-OUTPUT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m mark --mark 0x8000/0x8000 -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -m comment --comment "block incoming localnet connections" -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-NWPLCY-DEFAULT -m comment --comment "rule to mark traffic matching a network policy" -j MARK --set-xmark 0x10000/0x10000
-A KUBE-ROUTER-INPUT -d 10.96.0.0/12 -m comment --comment "allow traffic to cluster IP - 4H2UH6XHRCCZXCYQ" -j RETURN
-A KUBE-ROUTER-INPUT -p tcp -m comment --comment "allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M" -m addrtype --dst-type LOCAL -m multiport --dports 30000:32767 -j RETURN
-A KUBE-ROUTER-INPUT -p udp -m comment --comment "allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ" -m addrtype --dst-type LOCAL -m multiport --dports 30000:32767 -j RETURN
-I KUBE-POD-FW-5BZDTUBTJ6Y46WZY 1 -d 10.112.2.12 -m comment --comment "run through default ingress network policy  chain" -j KUBE-NWPLCY-DEFAULT 
-I KUBE-POD-FW-5BZDTUBTJ6Y46WZY 1 -s 10.112.2.12 -m comment --comment "run through default egress network policy  chain" -j KUBE-NWPLCY-DEFAULT 
-I KUBE-POD-FW-5BZDTUBTJ6Y46WZY 1 -m comment --comment "rule to permit the traffic to pods when source is the pod's local node" -m addrtype --src-type LOCAL -d 10.112.2.12 -j ACCEPT 
-I KUBE-POD-FW-5BZDTUBTJ6Y46WZY 1 -m comment --comment "rule to drop invalid state for pod" -m conntrack --ctstate INVALID -j DROP 
-I KUBE-POD-FW-5BZDTUBTJ6Y46WZY 1 -m comment --comment "rule for stateful firewall for pod" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT 
-A KUBE-ROUTER-FORWARD -m comment --comment "rule to jump traffic destined to POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -d 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY
-A KUBE-ROUTER-OUTPUT -m comment --comment "rule to jump traffic destined to POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -d 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY
-A KUBE-ROUTER-FORWARD -m physdev --physdev-is-bridged -m comment --comment "rule to jump traffic destined to POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -d 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY 
-A KUBE-ROUTER-INPUT -m comment --comment "rule to jump traffic from POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -s 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY 
-A KUBE-ROUTER-FORWARD -m comment --comment "rule to jump traffic from POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -s 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY 
-A KUBE-ROUTER-OUTPUT -m comment --comment "rule to jump traffic from POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -s 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY 
-A KUBE-ROUTER-FORWARD -m physdev --physdev-is-bridged -m comment --comment "rule to jump traffic from POD name:coredns-78fcd69978-m5ccm namespace: kube-system to chain KUBE-POD-FW-5BZDTUBTJ6Y46WZY" -s 10.112.2.12 -j KUBE-POD-FW-5BZDTUBTJ6Y46WZY 
-A KUBE-POD-FW-5BZDTUBTJ6Y46WZY -m comment --comment "rule to log dropped traffic POD name:coredns-78fcd69978-m5ccm namespace: kube-system" -m mark ! --mark 0x10000/0x10000 -j NFLOG --nflog-group 100 -m limit --limit 10/minute --limit-burst 10 
-A KUBE-POD-FW-5BZDTUBTJ6Y46WZY -m comment --comment "rule to REJECT traffic destined for POD name:coredns-78fcd69978-m5ccm namespace: kube-system" -m mark ! --mark 0x10000/0x10000 -j REJECT 
-A KUBE-POD-FW-5BZDTUBTJ6Y46WZY -j MARK --set-mark 0/0x10000 
-A KUBE-POD-FW-5BZDTUBTJ6Y46WZY -m comment --comment "set mark to ACCEPT traffic that comply to network policies" -j MARK --set-mark 0x20000/0x20000 
-A KUBE-ROUTER-OUTPUT -m comment --comment "rule to explicitly ACCEPT traffic that comply to network policies" -m mark --mark 0x20000/0x20000 -j ACCEPT 
-A KUBE-ROUTER-INPUT -m comment --comment "rule to explicitly ACCEPT traffic that comply to network policies" -m mark --mark 0x20000/0x20000 -j ACCEPT 
-A KUBE-ROUTER-FORWARD -m comment --comment "rule to explicitly ACCEPT traffic that comply to network policies" -m mark --mark 0x20000/0x20000 -j ACCEPT 
COMMIT

F0224 09:47:00.014333       1 network_policy_controller.go:369] failed to run iptables command to create KUBE-ROUTER-INPUT chain due to running [/sbin/iptables -t filter -N KUBE-ROUTER-INPUT --wait]: exit status 4: iptables v1.8.7 (nf_tables):  CHAIN_USER_ADD failed (File exists): chain KUBE-ROUTER-INPUT

rkojedzinszky • Feb 24 '22

Can you post the output of iptables-save and nft list ruleset?

Also, assuming this is a test cluster and you are willing, can you try removing kube-proxy and running kube-router with --run-service-proxy=true?
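
Roughly something like this, if you want to try it (a sketch only; it assumes a standard kubeadm layout and the default kube-router DaemonSet name, so adjust to your manifests):

# Stop kube-proxy from managing rules:
kubectl -n kube-system delete daemonset kube-proxy

# Flush the rules kube-proxy left behind on each node (kube-proxy has a cleanup mode):
kube-proxy --cleanup

# Add --run-service-proxy=true to the kube-router args and restart the DaemonSet:
kubectl -n kube-system rollout restart daemonset kube-router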

I have a suspicion that this is a conflict in the way nftables handles the iptables wrapper and the fact that both kube-proxy and kube-router are trying to use iptables via nftables.
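
One way to check that theory on the affected node (assuming it ships both iptables variants, as Debian/Ubuntu do with iptables 1.8):

# Which backend the default iptables binary uses ("legacy" vs "nf_tables"):
iptables --version

# See whether the KUBE-* chains exist in one backend or both; chains duplicated
# across backends would point at a legacy/nf_tables mix-up:
iptables-legacy-save | grep -c ':KUBE-'
iptables-nft-save | grep -c ':KUBE-'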

aauren • Feb 25 '22

Can you post the output of iptables-save and nft list ruleset?

outputs.zip

Also, assuming this is a test cluster and you are willing, can you try removing kube-proxy and running kube-router with --run-service-proxy=true?

Yes, I am working on migrating from kube-proxy to kube-router. I already have clusters running only kube-router, and they have no such issues. The problem only shows up on a fresh node that also runs the kube-proxy DaemonSet.

I have a suspicion that this is a conflict in the way nftables handles the iptables wrapper and the fact that both kube-proxy and kube-router are trying to use iptables via nftables.

This does not happen every time, so I don't know exactly how to reproduce it. What is strange: I have 1 master and 2 worker nodes, and when I drain the master, the 2 coredns pods get scheduled onto different worker nodes. Most of the time, only the second worker's kube-router crashes.
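
When it does crash, I grab the previous container's log like this (the pod name is just a placeholder):

# Log of the crashed kube-router container instance on the affected worker:
kubectl -n kube-system logs kube-router-xxxxx --previous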

rkojedzinszky • Feb 25 '22

@rkojedzinszky Your other issue reminded me of this one; sorry I never got back to you here, it's been a crazy 2022. 😛

Are you still encountering this issue?

aauren • Oct 31 '22

@rkojedzinszky Your other issue reminded me of this one; sorry I never got back to you here, it's been a crazy 2022. 😛

Are you still encountering this issue?

I haven't encountered this issue in the past few months, so I am closing it.

rkojedzinszky • Nov 03 '22