
Calico CNI with Multiple Interfaces -- Pods located on non-egress nodes unable to reach external destinations

Open saibaldey opened this issue 8 months ago • 9 comments

Describe the version
Calico: v3.28.3
EgressGateway: v0.6.2

Describe the bug When the cluster runs Calico CNI with multiple interfaces, egress does not work for pods located on non-egress nodes; however, pods running on egress nodes are able to connect to the external IPs.

How To Reproduce

  1. Deployed Egress:
helm install egressgateway ./egressgateway \
        -n kube-system \
        --set feature.tunnelIpv4Subnet="192.200.0.1/16" \
        --wait --debug
  2. Deployed Egress Gateway:
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressGateway
metadata:
  name: "idm-stg-egress-gateway"
spec:
  ippools:
    ipv4:
    - "10.18.162.144"
  nodeSelector:
    selector:
      matchLabels:
        role: gateway
  3. Deployed Egress Policy:
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressPolicy
metadata:
  name: idm-stg-egress-policy
  namespace: idm-stg
spec:
  egressIP:
    useNodeIP: false
  appliedTo:
    podSelector:
      matchLabels:
        #egress: "true"
        app: "idm"
  egressGatewayName: idm-stg-egress-gateway
  4. Tried external IP access from pods deployed on the egress node:
/home # traceroute 10.18.166.9
traceroute to 10.18.166.9 (10.18.166.9), 30 hops max, 46 byte packets
 1  10.18.162.135 (10.18.162.135)  0.014 ms  0.010 ms  0.007 ms
 2  10.18.162.129 (10.18.162.129)  0.910 ms  1.601 ms  0.966 ms
 3  10.9.100.197 (10.9.100.197)  0.834 ms  0.660 ms  0.496 ms
 4  *^C
/home # echo| telnet 10.18.166.9 3360
Connected to 10.18.166.9
  5. Tried external IP access from pods deployed on a non-egress node:
/home # traceroute 10.18.166.9
traceroute to 10.18.166.9 (10.18.166.9), 30 hops max, 46 byte packets
 1  10.18.162.137 (10.18.162.137)  0.013 ms  0.012 ms  0.007 ms
 2  *  *^C
/home # echo| telnet 10.18.166.9 3360
telnet: can't connect to remote host (10.18.166.9): Connection timed out

Expected behavior Traffic should work: client pods on non-egress nodes should be able to reach external destinations via the egress gateway.


Additional context Worker nodes have multiple interfaces: ens192 & ens224.

We received the EIP from the subnet associated with the ens192 interface and decided to use it.

helm install egressgateway ./egressgateway \
      -n kube-system \
        --set feature.tunnelIpv4Subnet="192.200.0.1/16" \
        --set feature.tunnelDetectMethod="interface=ens192" \
        --set feature.enableGatewayReplyRoute=true \
        --wait --debug

We then tried with the command above; however, the result remained the same:

[svc_dms_admin@a0420tnedmsk8m01 egress]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:a5:3e:de brd ff:ff:ff:ff:ff:ff
    altname enp11s0
    inet 10.18.162.134/26 brd 10.18.162.191 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
3: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:a5:5b:a0 brd ff:ff:ff:ff:ff:ff
    altname enp19s0
    inet 10.12.132.70/26 brd 10.12.132.127 scope global noprefixroute ens224
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:b1:0f:2f:19 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
36: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 192.168.245.64/32 scope global tunl0
       valid_lft forever preferred_lft forever
88: egress.vxlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 66:f3:37:7a:aa:26 brd ff:ff:ff:ff:ff:ff
    inet 192.200.121.210/16 brd 192.200.255.255 scope global egress.vxlan
       valid_lft forever preferred_lft forever
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ ip route get  10.18.162.137
10.18.162.137 dev ens192 src 10.18.162.134 uid 1041
    cache
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ ip route get 10.18.162.135
10.18.162.135 dev ens192 src 10.18.162.134 uid 1041
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ ip route get 192.168.127.66
192.168.127.66 via 10.12.132.73 dev ens224 src 10.12.132.70 uid 1041
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ ip route get 192.168.40.66
192.168.40.66 via 10.12.132.71 dev ens224 src 10.12.132.70 uid 1041
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ ip route get 10.18.162.136
10.18.162.136 dev ens192 src 10.18.162.134 uid 1041
    cache
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ ip route get 192.168.127.67
192.168.127.67 via 10.12.132.73 dev ens224 src 10.12.132.70 uid 1041
    cache
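The `ip route get` results above can be mirrored offline with the stdlib, to make the split routing visible: destinations in the ens192 subnet are on-link, while the Calico pod/tunnel addresses are reached via ens224 next hops. This is a sketch with the two NIC subnets taken from the `ip a` output (ens192 = 10.18.162.134/26, ens224 = 10.12.132.70/26), not part of the original report:

```python
# Classify a destination against the two NIC subnets shown in `ip a`.
# Subnets are assumptions derived from the addresses in the output above.
import ipaddress

nics = {
    "ens192": ipaddress.ip_interface("10.18.162.134/26").network,  # 10.18.162.128/26
    "ens224": ipaddress.ip_interface("10.12.132.70/26").network,   # 10.12.132.64/26
}

def on_link_iface(dst):
    """Return the NIC whose subnet directly contains `dst`, else None."""
    addr = ipaddress.ip_address(dst)
    for name, net in nics.items():
        if addr in net:
            return name
    return None

print(on_link_iface("10.18.162.137"))   # → ens192 (matches the route output)
print(on_link_iface("192.168.127.66"))  # → None (reached via an ens224 gateway)
```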
Worker Node:

[svc_dms_admin@a0420tnedmsk8w01 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:a5:36:42 brd ff:ff:ff:ff:ff:ff
    altname enp11s0
    inet 10.18.162.135/26 brd 10.18.162.191 scope global noprefixroute ens192
       valid_lft forever preferred_lft forever
3: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:a5:00:ec brd ff:ff:ff:ff:ff:ff
    altname enp19s0
    inet 10.12.132.71/26 brd 10.12.132.127 scope global noprefixroute ens224
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:8f:89:89:5f brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
47: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 192.168.40.64/32 scope global tunl0
       valid_lft forever preferred_lft forever
130: cali277860b0ea8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default qlen 1000
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-1dff1503-b852-d3c7-5904-78c26a1486db
132: calieef9b5f24c1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default qlen 1000
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-2cc3e11b-a426-5da2-28dd-8e5ea0c3e6f1
137: egress.vxlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 66:2b:ef:14:19:78 brd ff:ff:ff:ff:ff:ff
    inet 192.200.151.60/16 brd 192.200.255.255 scope global egress.vxlan
       valid_lft forever preferred_lft forever
[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ kubectl get egresstunnel -o yaml
apiVersion: v1
items:
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8m01.grameenphone.com
    resourceVersion: "4818652"
    uid: aa47d4c7-b0fe-4954-a9a3-870a5d9cd318
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:45Z"
    mark: "0x26436ff2"
    phase: Ready
    tunnel:
      ipv4: 192.200.121.210
      mac: 66:f3:37:7a:aa:26
      parent:
        ipv4: 10.18.162.134
        name: ens192
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8w01.grameenphone.com
    resourceVersion: "4818667"
    uid: abb2b9ea-5b36-462c-9e3f-073d267efb64
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:46Z"
    mark: "0x268157a6"
    phase: Ready
    tunnel:
      ipv4: 192.200.151.60
      mac: 66:2b:ef:14:19:78
      parent:
        ipv4: 10.18.162.135
        name: ens192
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8w02.grameenphone.com
    resourceVersion: "4818666"
    uid: ff49cc05-03a1-4e91-a020-dad905f8073d
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:46Z"
    mark: "0x26546c92"
    phase: Ready
    tunnel:
      ipv4: 192.200.69.19
      mac: 66:2a:5d:65:40:23
      parent:
        ipv4: 10.18.162.136
        name: ens192
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8w03.grameenphone.com
    resourceVersion: "4818665"
    uid: de4ccd14-39af-4a22-a0da-e401868c1496
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:46Z"
    mark: "0x2610c890"
    phase: Ready
    tunnel:
      ipv4: 192.200.15.12
      mac: 66:a9:54:7d:1d:84
      parent:
        ipv4: 10.18.162.137
        name: ens192
kind: List
metadata:
  resourceVersion: ""

Master Node:

[svc_dms_admin@a0420tnedmsk8m01 ~]$ sudo iptables-save -c | grep -e ^* -e EGRESSGATEWAY
*filter
*nat
:EGRESSGATEWAY-SNAT-EIP - [0:0]
[2708428:209993440] -A POSTROUTING -m comment --comment "egw:x1tdBi75jif7GCxh" -m comment --comment "SNAT for egress traffic" -j EGRESSGATEWAY-SNAT-EIP
*mangle
:EGRESSGATEWAY-MARK-REQUEST - [0:0]
[17576065:4756896825] -A PREROUTING -m comment --comment "egw:Lh98b3mb9WlZrgw7" -m comment --comment "Checking for EgressPolicy matched traffic" -j EGRESSGATEWAY-MARK-REQUEST
[0:0] -A EGRESSGATEWAY-MARK-REQUEST -m comment --comment "egw:gc2YmNaOvMLZmwkl" -m comment --comment "Set mark for EgressPolicy idm-stg-idm-stg-egress-policy" -m set --match-set egress-src-v4-205d1eb3fc1220351 src -m set ! --match-set egress-cluster-cidr-ipv4 dst -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x268157a6/0xffffffff
*raw

Worker 1

[svc_dms_admin@a0420tnedmsk8w01 ~]$ sudo iptables-save -c | grep -e ^* -e EGRESSGATEWAY
*filter
*nat
:EGRESSGATEWAY-SNAT-EIP - [0:0]
[3250859:248302673] -A POSTROUTING -m comment --comment "egw:x1tdBi75jif7GCxh" -m comment --comment "SNAT for egress traffic" -j EGRESSGATEWAY-SNAT-EIP
[1:60] -A EGRESSGATEWAY-SNAT-EIP -m comment --comment "egw:XxIM2wCnOMtos724" -m comment --comment "snat policy idm-stg-idm-stg-egress-policy" -m set --match-set egress-src-v4-205d1eb3fc1220351 src -m set ! --match-set egress-cluster-cidr-ipv4 dst -m conntrack --ctdir ORIGINAL -j SNAT --to-source 10.18.162.144
*mangle
:EGRESSGATEWAY-MARK-REQUEST - [0:0]
:EGRESSGATEWAY-REPLY-ROUTING - [0:0]
[1233797:1634889705] -A PREROUTING -m comment --comment "egw:Lh98b3mb9WlZrgw7" -m comment --comment "Checking for EgressPolicy matched traffic" -j EGRESSGATEWAY-MARK-REQUEST
[1233797:1634889705] -A PREROUTING -m comment --comment "egw:T0xtgSfAE9gtzdEV" -m comment --comment "EgressGateway reply datapath rule, rule is from the EgressGateway" -j EGRESSGATEWAY-REPLY-ROUTING
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:Zb7UxCu5AnQgEMUu" -m comment --comment "Mark the traffic from the EgressGateway tunnel, rule is from the EgressGateway" -m mac --mac-source 66:F3:37:7A:AA:26 -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x26436ff2/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:yTQbFQs0k01uI4Vl" -m comment --comment "Save mark to the connection, rule is from the EgressGateway" -m mark --mark 0x26436ff2 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:EuD8j1XAKR7hDFA6" -m comment --comment "Clear Mark of the inner package, rule is from the EgressGateway" -m mac --mac-source 66:F3:37:7A:AA:26 -j MARK --set-xmark 0x26000000/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:R2btXiIdMA-e4FzT" -m comment --comment "Mark the traffic from the EgressGateway tunnel, rule is from the EgressGateway" -m mac --mac-source 66:2A:5D:65:40:23 -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x26546c92/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:x5q_d5XrM-MpH7hb" -m comment --comment "Save mark to the connection, rule is from the EgressGateway" -m mark --mark 0x26546c92 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:JiRBYkHqPDQz9j1j" -m comment --comment "Clear Mark of the inner package, rule is from the EgressGateway" -m mac --mac-source 66:2A:5D:65:40:23 -j MARK --set-xmark 0x26000000/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:UOXw0DnC1eBbq8Wn" -m comment --comment "Mark the traffic from the EgressGateway tunnel, rule is from the EgressGateway" -m mac --mac-source 66:A9:54:7D:1D:84 -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x2610c890/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:sn21QnnV3YWai4QV" -m comment --comment "Save mark to the connection, rule is from the EgressGateway" -m mark --mark 0x2610c890 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:5REJVw5IlT_zYqTa" -m comment --comment "Clear Mark of the inner package, rule is from the EgressGateway" -m mac --mac-source 66:A9:54:7D:1D:84 -j MARK --set-xmark 0x26000000/0xffffffff
[7341:7631347] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:rVm6RtJXlimVmyZe" -m comment --comment "label for restoring connections, rule is from the EgressGateway" -m conntrack --ctdir REPLY -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
*raw

Worker 2

[svc_dms_admin@a0420tnedmsk8w02 ~]$ sudo iptables-save -c | grep -e ^* -e EGRESSGATEWAY
*filter
*nat
:EGRESSGATEWAY-SNAT-EIP - [0:0]
[669681:42387609] -A POSTROUTING -m comment --comment "egw:x1tdBi75jif7GCxh" -m comment --comment "SNAT for egress traffic" -j EGRESSGATEWAY-SNAT-EIP
*mangle
:EGRESSGATEWAY-MARK-REQUEST - [0:0]
:EGRESSGATEWAY-REPLY-ROUTING - [0:0]
[1009643:698850922] -A PREROUTING -m comment --comment "egw:Lh98b3mb9WlZrgw7" -m comment --comment "Checking for EgressPolicy matched traffic" -j EGRESSGATEWAY-MARK-REQUEST
[1009643:698850922] -A PREROUTING -m comment --comment "egw:T0xtgSfAE9gtzdEV" -m comment --comment "EgressGateway reply datapath rule, rule is from the EgressGateway" -j EGRESSGATEWAY-REPLY-ROUTING
[0:0] -A EGRESSGATEWAY-MARK-REQUEST -m comment --comment "egw:gc2YmNaOvMLZmwkl" -m comment --comment "Set mark for EgressPolicy idm-stg-idm-stg-egress-policy" -m set --match-set egress-src-v4-205d1eb3fc1220351 src -m set ! --match-set egress-cluster-cidr-ipv4 dst -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x268157a6/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:AvT0WHHYiQGzaLpK" -m comment --comment "Mark the traffic from the EgressGateway tunnel, rule is from the EgressGateway" -m mac --mac-source 66:2B:EF:14:19:78 -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x268157a6/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:UPqFq165Tw4l7uGY" -m comment --comment "Save mark to the connection, rule is from the EgressGateway" -m mark --mark 0x268157a6 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:pUJiC6zNYbXoVYxr" -m comment --comment "Clear Mark of the inner package, rule is from the EgressGateway" -m mac --mac-source 66:2B:EF:14:19:78 -j MARK --set-xmark 0x26000000/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:3TwRMam2IUZjsDP8" -m comment --comment "Mark the traffic from the EgressGateway tunnel, rule is from the EgressGateway" -m mac --mac-source 66:A9:54:7D:1D:84 -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x2610c890/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:ONG950P8ohJyJGLb" -m comment --comment "Save mark to the connection, rule is from the EgressGateway" -m mark --mark 0x2610c890 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:1-d4SyELWPkIBBLA" -m comment --comment "Clear Mark of the inner package, rule is from the EgressGateway" -m mac --mac-source 66:A9:54:7D:1D:84 -j MARK --set-xmark 0x26000000/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:DTh5WpkNmsmZQej6" -m comment --comment "Mark the traffic from the EgressGateway tunnel, rule is from the EgressGateway" -m mac --mac-source 66:F3:37:7A:AA:26 -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x26436ff2/0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:q5H5zxCFqJx8CMIc" -m comment --comment "Save mark to the connection, rule is from the EgressGateway" -m mark --mark 0x26436ff2 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
[0:0] -A EGRESSGATEWAY-REPLY-ROUTING -i egress.vxlan -m comment --comment "egw:L8zH4kEOAYl1iOiZ" -m comment --comment "Clear Mark of the inner package, rule is from the EgressGateway" -m mac --mac-source 66:F3:37:7A:AA:26 -j MARK --set-xmark 0x26000000/0xffffffff
[22055:17858063] -A EGRESSGATEWAY-REPLY-ROUTING -m comment --comment "egw:5LX_fe5ELye3ePQc" -m comment --comment "label for restoring connections, rule is from the EgressGateway" -m conntrack --ctdir REPLY -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
*raw

Worker 3

[svc_dms_admin@a0420tnedmsk8w03 ~]$ sudo iptables-save -c | grep -e ^* -e EGRESSGATEWAY
*filter
*nat
:EGRESSGATEWAY-SNAT-EIP - [0:0]
[747804:48049736] -A POSTROUTING -m comment --comment "egw:x1tdBi75jif7GCxh" -m comment --comment "SNAT for egress traffic" -j EGRESSGATEWAY-SNAT-EIP
*mangle
:EGRESSGATEWAY-MARK-REQUEST - [0:0]
[1418404:834339662] -A PREROUTING -m comment --comment "egw:Lh98b3mb9WlZrgw7" -m comment --comment "Checking for EgressPolicy matched traffic" -j EGRESSGATEWAY-MARK-REQUEST
[8:480] -A EGRESSGATEWAY-MARK-REQUEST -m comment --comment "egw:gc2YmNaOvMLZmwkl" -m comment --comment "Set mark for EgressPolicy idm-stg-idm-stg-egress-policy" -m set --match-set egress-src-v4-205d1eb3fc1220351 src -m set ! --match-set egress-cluster-cidr-ipv4 dst -m conntrack --ctdir ORIGINAL -j MARK --set-xmark 0x268157a6/0xffffffff
*raw

saibaldey commented Mar 25 '25

@lou-lan @tboerger @yankay @rfyiamcool ... we are totally blocked and need your help on this. If more information is required, please let us know.

saibaldey commented Mar 25 '25

My experience has been that changes don't work properly (at least for us), and we ended up uninstalling the gateway, rebooting all nodes, and installing the gateway again to properly handle new base configurations.

tboerger commented Mar 25 '25

Hi, please check if the Egress Tunnels can access each other:

kubectl get EgressTunnel

lou-lan commented Mar 26 '25

@lou-lan .. here is the output ..

[svc_dms_admin@a0420tnedmsk8m01 calico_yamls]$ kubectl get egresstunnel -o yaml
apiVersion: v1
items:
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8m01.grameenphone.com
    resourceVersion: "4818652"
    uid: aa47d4c7-b0fe-4954-a9a3-870a5d9cd318
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:45Z"
    mark: "0x26436ff2"
    phase: Ready
    tunnel:
      ipv4: 192.200.121.210
      mac: 66:f3:37:7a:aa:26
      parent:
        ipv4: 10.18.162.134
        name: ens192
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8w01.grameenphone.com
    resourceVersion: "4818667"
    uid: abb2b9ea-5b36-462c-9e3f-073d267efb64
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:46Z"
    mark: "0x268157a6"
    phase: Ready
    tunnel:
      ipv4: 192.200.151.60
      mac: 66:2b:ef:14:19:78
      parent:
        ipv4: 10.18.162.135
        name: ens192
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8w02.grameenphone.com
    resourceVersion: "4818666"
    uid: ff49cc05-03a1-4e91-a020-dad905f8073d
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:46Z"
    mark: "0x26546c92"
    phase: Ready
    tunnel:
      ipv4: 192.200.69.19
      mac: 66:2a:5d:65:40:23
      parent:
        ipv4: 10.18.162.136
        name: ens192
- apiVersion: egressgateway.spidernet.io/v1beta1
  kind: EgressTunnel
  metadata:
    creationTimestamp: "2025-03-24T13:46:45Z"
    finalizers:
    - egressgateway.spidernet.io/egresstunnel
    generation: 1
    name: a0420tnedmsk8w03.grameenphone.com
    resourceVersion: "4818665"
    uid: de4ccd14-39af-4a22-a0da-e401868c1496
  spec: {}
  status:
    lastHeartbeatTime: "2025-03-24T13:46:46Z"
    mark: "0x2610c890"
    phase: Ready
    tunnel:
      ipv4: 192.200.15.12
      mac: 66:a9:54:7d:1d:84
      parent:
        ipv4: 10.18.162.137
        name: ens192
kind: List
metadata:
  resourceVersion: ""

saibaldey commented Mar 26 '25

My experience has been that changes don't work properly (at least for us), and we ended up uninstalling the gateway, rebooting all nodes, and installing the gateway again to properly handle new base configurations.

Thanks @tboerger for sharing your experience. We will do the needful and share the results.

saibaldey commented Mar 26 '25

Just to update: we were able to reproduce the issue in another environment, and we are now sure the root cause is the multiple interfaces. With a single interface everything works fine; with multiple interfaces, pods on the non-egress nodes are not able to reach external resources. FYI @tboerger / @lou-lan

saibaldey commented Mar 26 '25

My experience has been that changes don't work properly (at least for us), and we ended up uninstalling the gateway, rebooting all nodes, and installing the gateway again to properly handle new base configurations.

Thanks @tboerger for sharing your experience. We will do the needful and share the results.

Even after the reboot it is still the same, @tboerger.

saibaldey commented Mar 26 '25

Any update, @tboerger @tboerger @lou-lan @weizhoublue @yankay? Please confirm whether this solution supports VMs with multiple interfaces, so that we can wait; otherwise we will explore other options.

saibaldey commented Apr 10 '25

I'm not really involved in this project besides some tiny contributions.

tboerger commented Apr 10 '25

@lou-lan

weizhoublue commented Jun 04 '25

Any update, @tboerger @lou-lan @weizhoublue @yankay? Please confirm whether this solution supports VMs with multiple interfaces, so that we can wait; otherwise we will explore other options.

Sorry for the late reply. Could you please list your network interfaces by running:

ip a

Also, please run:

kubectl get egp

Note: Even if you have multiple interfaces, please make sure the egress IP you’re using belongs to the subnet of one of them.
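The subnet check above can be done offline with the stdlib. Using the addresses already shown in this thread (EIP 10.18.162.144; ens192 = 10.18.162.134/26, ens224 = 10.12.132.70/26), a minimal sketch:

```python
# Does the egress IP belong to the subnet of one of the node interfaces?
import ipaddress

eip = ipaddress.ip_address("10.18.162.144")
ens192 = ipaddress.ip_interface("10.18.162.134/26").network  # 10.18.162.128/26
ens224 = ipaddress.ip_interface("10.12.132.70/26").network   # 10.12.132.64/26
print(eip in ens192, eip in ens224)  # → True False
```

So the EIP chosen here does sit inside the ens192 subnet, as the reporter stated.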

lou-lan commented Jun 04 '25

This issue has had no updates for a long time, so it will be temporarily closed. You can try the new v0.6.7 version, which includes a fix for the issue caused by rp_filter. If any problems arise, please feel free to reopen it.
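The rp_filter setting mentioned above can be checked on a node. A small helper (an illustration, not part of the project) that reads the kernel's per-interface reverse-path filter mode, where 0 = off, 1 = strict, 2 = loose; strict mode on multi-NIC hosts can drop the asymmetric replies the egress tunnel relies on:

```python
# Read the reverse-path filter mode for a given interface from /proc.
from pathlib import Path

def rp_filter_mode(iface="all"):
    """Return the rp_filter mode for `iface` (0/1/2), or None if unreadable."""
    path = Path("/proc/sys/net/ipv4/conf") / iface / "rp_filter"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

print(rp_filter_mode("all"))
```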

lou-lan commented Oct 29 '25