
External Connectivity Loss During Flannel Pod Restart

Open · devasmith opened this issue 6 months ago · 5 comments

Description

When restarting the Flannel pod (via the rke2-canal pod in an RKE2 cluster), all rules in the FLANNEL-POSTRTG chain of the iptables NAT table are deleted. This results in a temporary loss of external connectivity for pods on the same node until the Flannel pod is fully restarted and the rules are reconciled.

This behavior disrupts workloads that rely on external connectivity and appears to be related to how Flannel handles rule reconciliation during startup.

Steps to Reproduce

  1. Start monitoring the FLANNEL-POSTRTG chain in the iptables NAT table:
    watch -n0.1 "iptables -t nat -L FLANNEL-POSTRTG -n -v"
    
  2. Run a continuous curl loop from a pod on the same node to an external endpoint (e.g., https://www.google.com) to monitor connectivity:
    while true; do curl -o /dev/null -s -w 'Establish Connection: %{time_connect}s  TTFB: %{time_starttransfer}s  Total: %{time_total}s\n\n' https://www.google.com; done
    
  3. Delete the Flannel pod (or rke2-canal pod in RKE2):
    kubectl delete pod -n kube-system <flannel-pod-name>
    
  4. Observe that all rules in the FLANNEL-POSTRTG chain are deleted until the Flannel pod is fully restarted (a script to time this window is sketched below).
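
To measure the length of the outage window, the chain can be polled in a loop. The script below is a rough sketch: it lists the chain with iptables -S every 100 ms and prints a timestamp when the chain empties and again when rules reappear.

    #!/bin/sh
    # Timestamp the moments when FLANNEL-POSTRTG is emptied and repopulated.
    prev=populated
    while true; do
      # Count the appended rules ("-A ..." lines) currently in the chain.
      n=$(iptables -t nat -S FLANNEL-POSTRTG 2>/dev/null | grep -c '^-A')
      if [ "$n" -eq 0 ] && [ "$prev" = populated ]; then
        echo "$(date +%T.%N) chain emptied"
        prev=empty
      elif [ "$n" -gt 0 ] && [ "$prev" = empty ]; then
        echo "$(date +%T.%N) rules restored ($n rules)"
        prev=populated
      fi
      sleep 0.1
    done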

Observed Behavior

  • During the restart of the Flannel pod, the FLANNEL-POSTRTG chain in the iptables NAT table is emptied.
  • External connectivity for pods on the same node is lost until the Flannel pod completes its startup and reconciles the rules.

Example of FLANNEL-POSTRTG Chain Before Restart:

Chain FLANNEL-POSTRTG (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x4000/0x4000 /* flanneld masq */
   40  3000 RETURN     all  --  *      *       10.42.0.0/24         10.42.0.0/16         /* flanneld masq */
    0     0 RETURN     all  --  *      *       10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
   13   780 RETURN     all  --  *      *      !10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
    8   480 MASQUERADE  all  --  *      *       10.42.0.0/16        !224.0.0.0/4          /* flanneld masq */ random-fully
    0     0 MASQUERADE  all  --  *      *      !10.42.0.0/16         10.42.0.0/16         /* flanneld masq */ random-fully

Example of FLANNEL-POSTRTG Chain After Restart:

Chain FLANNEL-POSTRTG (1 references)
 pkts bytes target     prot opt in     out     source               destination

Expected Behavior

The FLANNEL-POSTRTG chain should not be emptied during the restart of the Flannel pod. Existing rules should remain intact to avoid disruption of external connectivity for pods.
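
In iptables terms, the non-disruptive way to reconcile is to check each desired rule with -C and append it only when missing, rather than flushing the chain and rebuilding it from scratch. A minimal sketch of that pattern, reusing two of the example rules above (the 10.42.0.0/16 CIDR is specific to this cluster and illustrative only):

    # Append a rule to FLANNEL-POSTRTG only if the check (-C) reports it
    # missing, so rules that already exist (and the traffic matching them)
    # are never disturbed.
    ensure_rule() {
      iptables -t nat -C FLANNEL-POSTRTG "$@" 2>/dev/null \
        || iptables -t nat -A FLANNEL-POSTRTG "$@"
    }

    ensure_rule -m mark --mark 0x4000/0x4000 -m comment --comment "flanneld masq" -j RETURN
    ensure_rule -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment "flanneld masq" -j MASQUERADE --random-fully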

Environment

  • Flannel Version: v0.26.5
  • RKE2 Version: v1.31.7~rke2r1
  • iptables Version: v1.8.10
  • OS: Rocky 9 (5.14.0-503.40.1.el9_5.x86_64)

Additional Context

This issue does not occur with CNI implementations such as Cilium, which replaces kube-proxy and does not rely on iptables in the same way.

Reference issue: https://github.com/rancher/rke2/issues/8151

devasmith · May 13 '25 06:05

This might be related to this change: https://github.com/flannel-io/flannel/pull/1881/commits/f244861ae8bff9bcebd48a641859c135076186cb

The PR also fixes the clean-up mechanism in the iptables implementation.

devasmith · May 14 '25 06:05

Thanks for the report. We can try to make the iptables reconciliation quicker, but I'm not sure we can entirely avoid disrupting connections, given the way iptables works.

thomasferrandiz · May 14 '25 08:05
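
Until a faster reconciliation lands, one crude operational stopgap (a sketch under local assumptions, not an official workaround) is to snapshot the FLANNEL-POSTRTG rules before restarting the pod and re-inject them with iptables-restore --noflush if the chain is emptied. Flannel will still reconcile on startup, and re-injecting at the wrong moment can leave duplicate rules, so treat this as a measure of last resort:

    # Save only the FLANNEL-POSTRTG rules from the nat table.
    iptables-save -t nat | grep -- '-A FLANNEL-POSTRTG' > /tmp/flannel-postrtg.rules

    # If the chain is emptied during the restart, append the saved rules
    # back without flushing anything else.
    { echo '*nat'; cat /tmp/flannel-postrtg.rules; echo 'COMMIT'; } | iptables-restore --noflush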

Hi! I work with @devasmith and am the one who found the cause here. I think the only way to avoid the network going down is to not clean up the rules and instead reconcile them after the restart (which was the previous behavior). I'm pretty sure that's why they weren't cleaned up before in the implementation: to avoid downtime of the k8s cluster during a CNI upgrade.

jonaz · May 15 '25 06:05

I am +1 on not defaulting to deleting all the rules on shutdown. This seems like a major change to slip into another commit as an in-passing bugfix. Something like that should probably be gated on a flag or env var. I think most users would want the rules to stay in place and then just get reconciled again as a delta on startup.

brandond · May 15 '25 21:05
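
As an illustration of the gating idea, the shutdown path could look roughly like this; FLANNEL_CLEANUP_ON_EXIT is a hypothetical name, not an existing flannel option:

    # Hypothetical gate: flush the chain on shutdown only when explicitly
    # requested; by default leave the rules in place so a delta reconcile
    # on startup can pick them up without a connectivity gap.
    if [ "${FLANNEL_CLEANUP_ON_EXIT:-false}" = "true" ]; then
      iptables -t nat -F FLANNEL-POSTRTG
    fi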

The fix was released in https://github.com/flannel-io/flannel/releases/tag/v0.27.0

thomasferrandiz · Jun 05 '25 14:06