meshnet-cni
Pods connected via VXLAN cannot ping each other
I have an OSPF topology with 10 nodes, all running the same FRRouting image, and I want to test this topology with meshnet-cni.
My k8s cluster has a total of 4 nodes connected through Calico in BGP mode.
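Here is the node list, as shown by:
kubectl get nodes -o wide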
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane 7h49m v1.24.0 192.168.22.1 <none> Ubuntu 20.04.4 LTS 5.4.0-113-generic containerd://1.6.4
node-d Ready <none> 7h47m v1.24.0 192.168.22.3 <none> Ubuntu 20.04.4 LTS 5.4.0-113-generic containerd://1.6.4
node-i Ready <none> 7h45m v1.24.0 192.168.22.6 <none> Ubuntu 20.04.4 LTS 5.4.0-113-generic containerd://1.6.4
node-k Ready <none> 7h46m v1.24.0 192.168.22.7 <none> Ubuntu 20.04.4 LTS 5.4.0-110-generic containerd://1.6.4
I created my topology and distributed the pods across two of the nodes:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
router-1 1/1 Running 0 11m 10.224.71.138 node-d <none> <none>
router-10 1/1 Running 0 11m 10.224.100.91 node-k <none> <none>
router-2 1/1 Running 0 11m 10.224.100.86 node-k <none> <none>
router-3 1/1 Running 0 11m 10.224.100.87 node-k <none> <none>
router-4 1/1 Running 0 11m 10.224.71.137 node-d <none> <none>
router-5 1/1 Running 0 11m 10.224.100.88 node-k <none> <none>
router-6 1/1 Running 0 11m 10.224.100.89 node-k <none> <none>
router-7 1/1 Running 0 11m 10.224.100.90 node-k <none> <none>
router-8 1/1 Running 0 11m 10.224.71.140 node-d <none> <none>
router-9 1/1 Running 0 11m 10.224.71.139 node-d <none> <none>
For example, router-4 should establish neighbor relationships with router-3, router-5, and router-7, but the adjacencies are never established and the routers cannot ping each other:
router-4# show ip ospf neighbor
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
10.224.100.87 1 Init/DROther 22m21s 38.205s 10.0.4.1 eth1:10.0.4.2 0 0 0
10.224.100.88 1 Init/DROther 22m21s 38.093s 10.0.7.2 eth2:10.0.7.1 0 0 0
router-4# ping 10.0.4.1
PING 10.0.4.1 (10.0.4.1): 56 data bytes
^C
--- 10.0.4.1 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
However, if all the pods run on the same node and are connected via veth pairs, the adjacencies come up fine:
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
10.224.15.148 1 Full/DR 3h40m52s 38.032s 10.0.4.1 eth1:10.0.4.2 0 0 0
10.224.15.149 1 Full/DR 3h40m46s 33.848s 10.0.8.2 eth3:10.0.8.1 0 0 0
10.224.219.86 1 Full/Backup 3h34m08s 31.814s 10.0.7.2 eth2:10.0.7.1 0 0 0
Can someone help me?
Can you elaborate on how you create your topology? Do you have a list of instructions to reproduce this?
Thanks for your reply. Here is my topology; all the configurations are here: https://github.com/kkgty/topo-ospf
apiVersion: v1
kind: List
items:
  ################### pod #######################
  ###### router-1 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-1
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-1
  ###### router-2 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-2
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-2
  ###### router-3 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-3
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-3
  ###### router-4 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-4
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-4
  ###### router-5 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-5
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-5
  ###### router-6 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-6
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-6
  ###### router-7 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-7
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-7
  ###### router-8 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-8
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-8
  ###### router-9 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-9
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-9
  ###### router-10 ######
  - apiVersion: v1
    kind: Pod
    metadata:
      name: router-10
      labels:
        name: tunnel
    spec:
      containers:
        - name: tunnel
          image: frrouting/frr:v8.2.2
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          securityContext:
            privileged: true
          volumeMounts:
            - name: config
              mountPath: /etc/frr/daemons
              subPath: daemons
            - name: config
              mountPath: /etc/frr/frr.conf
              subPath: frr.conf
      volumes:
        - name: config
          configMap:
            name: router-10
  ################### topo #######################
  ###### router-1 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-1
    spec:
      links:
        - uid: 2218012
          peer_pod: router-2
          local_intf: eth1
          peer_intf: eth1
          local_ip: 10.0.1.1/24
          peer_ip: 10.0.1.2/24
        - uid: 2218013
          peer_pod: router-3
          local_intf: eth2
          peer_intf: eth1
          local_ip: 10.0.2.1/24
          peer_ip: 10.0.2.2/24
  ###### router-2 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-2
    spec:
      links:
        - uid: 2218012
          peer_pod: router-1
          local_intf: eth1
          peer_intf: eth1
          local_ip: 10.0.1.2/24
          peer_ip: 10.0.1.1/24
        - uid: 2218023
          peer_pod: router-3
          local_intf: eth2
          peer_intf: eth2
          local_ip: 10.0.3.1/24
          peer_ip: 10.0.3.2/24
  ###### router-3 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-3
    spec:
      links:
        - uid: 2218013
          peer_pod: router-1
          local_intf: eth1
          peer_intf: eth2
          local_ip: 10.0.2.2/24
          peer_ip: 10.0.2.1/24
        - uid: 2218023
          peer_pod: router-2
          local_intf: eth2
          peer_intf: eth2
          local_ip: 10.0.3.2/24
          peer_ip: 10.0.3.1/24
        - uid: 2218034
          peer_pod: router-4
          local_intf: eth3
          peer_intf: eth1
          local_ip: 10.0.4.1/24
          peer_ip: 10.0.4.2/24
        - uid: 2218035
          peer_pod: router-5
          local_intf: eth4
          peer_intf: eth1
          local_ip: 10.0.5.1/24
          peer_ip: 10.0.5.2/24
        - uid: 2218036
          peer_pod: router-6
          local_intf: eth5
          peer_intf: eth1
          local_ip: 10.0.6.1/24
          peer_ip: 10.0.6.2/24
  ###### router-4 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-4
    spec:
      links:
        - uid: 2218034
          peer_pod: router-3
          local_intf: eth1
          peer_intf: eth3
          local_ip: 10.0.4.2/24
          peer_ip: 10.0.4.1/24
        - uid: 2218045
          peer_pod: router-5
          local_intf: eth2
          peer_intf: eth2
          local_ip: 10.0.7.1/24
          peer_ip: 10.0.7.2/24
        - uid: 2218047
          peer_pod: router-7
          local_intf: eth3
          peer_intf: eth1
          local_ip: 10.0.8.1/24
          peer_ip: 10.0.8.2/24
  ###### router-5 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-5
    spec:
      links:
        - uid: 2218035
          peer_pod: router-3
          local_intf: eth1
          peer_intf: eth4
          local_ip: 10.0.5.2/24
          peer_ip: 10.0.5.1/24
        - uid: 2218045
          peer_pod: router-4
          local_intf: eth2
          peer_intf: eth2
          local_ip: 10.0.7.2/24
          peer_ip: 10.0.7.1/24
        - uid: 2218058
          peer_pod: router-8
          local_intf: eth3
          peer_intf: eth2
          local_ip: 10.0.10.1/24
          peer_ip: 10.0.10.2/24
  ###### router-6 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-6
    spec:
      links:
        - uid: 2218036
          peer_pod: router-3
          local_intf: eth1
          peer_intf: eth5
          local_ip: 10.0.6.2/24
          peer_ip: 10.0.6.1/24
        - uid: 2218068
          peer_pod: router-8
          local_intf: eth2
          peer_intf: eth1
          local_ip: 10.0.9.1/24
          peer_ip: 10.0.9.2/24
  ###### router-7 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-7
    spec:
      links:
        - uid: 2218047
          peer_pod: router-4
          local_intf: eth1
          peer_intf: eth3
          local_ip: 10.0.8.2/24
          peer_ip: 10.0.8.1/24
        - uid: 2218078
          peer_pod: router-8
          local_intf: eth2
          peer_intf: eth3
          local_ip: 10.0.11.1/24
          peer_ip: 10.0.11.2/24
  ###### router-8 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-8
    spec:
      links:
        - uid: 2218058
          peer_pod: router-5
          local_intf: eth2
          peer_intf: eth3
          local_ip: 10.0.10.2/24
          peer_ip: 10.0.10.1/24
        - uid: 2218068
          peer_pod: router-6
          local_intf: eth1
          peer_intf: eth2
          local_ip: 10.0.9.2/24
          peer_ip: 10.0.9.1/24
        - uid: 2218078
          peer_pod: router-7
          local_intf: eth3
          peer_intf: eth2
          local_ip: 10.0.11.2/24
          peer_ip: 10.0.11.1/24
        - uid: 2218089
          peer_pod: router-9
          local_intf: eth5
          peer_intf: eth1
          local_ip: 10.0.13.1/24
          peer_ip: 10.0.13.2/24
        - uid: 2218080
          peer_pod: router-10
          local_intf: eth4
          peer_intf: eth1
          local_ip: 10.0.12.1/24
          peer_ip: 10.0.12.2/24
  ###### router-9 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-9
    spec:
      links:
        - uid: 2218089
          peer_pod: router-8
          local_intf: eth1
          peer_intf: eth5
          local_ip: 10.0.13.2/24
          peer_ip: 10.0.13.1/24
        - uid: 2218090
          peer_pod: router-10
          local_intf: eth2
          peer_intf: eth2
          local_ip: 10.0.14.1/24
          peer_ip: 10.0.14.2/24
  ###### router-10 ######
  - apiVersion: networkop.co.uk/v1beta1
    kind: Topology
    metadata:
      name: router-10
    spec:
      links:
        - uid: 2218080
          peer_pod: router-8
          local_intf: eth1
          peer_intf: eth4
          local_ip: 10.0.12.2/24
          peer_ip: 10.0.12.1/24
        - uid: 2218090
          peer_pod: router-9
          local_intf: eth2
          peer_intf: eth2
          local_ip: 10.0.14.2/24
          peer_ip: 10.0.14.1/24
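For completeness, the manifest can be applied along these lines (the file name topo.yml is a placeholder; the tunnel namespace matches the kubectl exec commands later in the thread):
kubectl apply -n tunnel -f topo.yml
kubectl get pods -n tunnel -o wide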
Can you also provide the output of ip -d link show from inside one of the routers with neighbors in the Init/DROther state?
e.g. kubectl exec -it router-4 -- ip -d link show
Sure, here's the output from router-4:
node-d ➜ ~ kubectl exec -n tunnel router-4 -- ip -d link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 minmtu 0 maxmtu 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
3: eth0@if178: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether e6:a1:b0:66:3f:99 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
188: eth3@if188: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 66:d8:31:8b:88:15 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
vxlan id 2223047 remote 192.168.22.7 dev if2 srcport 0 0 dstport 4789 l2miss l3miss ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
189: eth1@if189: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 7e:ca:8c:a9:70:11 brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
vxlan id 2223034 remote 192.168.22.7 dev if2 srcport 0 0 dstport 4789 l2miss l3miss ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
191: eth2@if191: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 86:1d:2c:1d:1d:dc brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
vxlan id 2223045 remote 192.168.22.7 dev if2 srcport 0 0 dstport 4789 l2miss l3miss ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
and from router-5:
node-d ➜ ~ kubectl exec -n tunnel router-5 -- ip -d link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 minmtu 0 maxmtu 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
3: eth0@if194: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether e2:4b:59:44:b2:ac brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
205: eth1@if204: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether b2:be:cd:69:92:cd brd ff:ff:ff:ff:ff:ff link-netnsid 1 promiscuity 0 minmtu 68 maxmtu 65535
veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
211: eth2@if211: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 56:89:62:6e:79:ab brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
vxlan id 2223045 remote 192.168.22.3 dev eth0 srcport 0 0 dstport 4789 l2miss l3miss ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
215: eth3@if215: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 86:fc:ef:8b:1f:3e brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 minmtu 68 maxmtu 65535
vxlan id 2223058 remote 192.168.22.3 dev eth0 srcport 0 0 dstport 4789 l2miss l3miss ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
node-d ➜ ~ kubectl exec -n tunnel router-5 -- vtysh -c "show ip ospf neighbor"
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
10.224.100.87 1 Full/Backup 2h18m15s 39.314s 10.0.5.1 eth1:10.0.5.2 0 0 0
10.224.71.140 1 Full/Backup 2h18m20s 37.147s 10.0.10.2 eth3:10.0.10.1 0 0 0
Everything seems fine. I even deployed your topology locally and was able to see all adjacencies established. The only issue you've got is that router-4 is missing an ip ospf area statement under its eth3 interface. But otherwise, everything looks correct.
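For example, in vtysh, something like this (area 0 is an assumption on my part; use whatever area the rest of your topology runs in):
router-4# configure terminal
router-4(config)# interface eth3
router-4(config-if)# ip ospf area 0
router-4(config-if)# end
router-4# write memory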
Init/DROther is a really weird state; it's as if the packets are being dropped in one direction. I'd suggest checking whether you have any ACLs on the hosts (or somewhere in the network) that might drop UDP/VXLAN packets between them, and checking the host logs for any errors during pod deployment. As an experiment, you can try creating a pair of VXLAN interfaces manually, giving them IPs, and pinging between them, along the lines of the sketch below. Other than that, I don't have any other ideas.
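A rough sketch of that test (VNI 5000, the 10.255.0.0/24 test subnet, and the eth0 underlay NIC are placeholder values; adjust to your hosts and run as root):
# On node-d (192.168.22.3):
ip link add vxlan-test type vxlan id 5000 remote 192.168.22.7 dstport 4789 dev eth0
ip addr add 10.255.0.1/24 dev vxlan-test
ip link set vxlan-test up
# On node-k (192.168.22.7):
ip link add vxlan-test type vxlan id 5000 remote 192.168.22.3 dstport 4789 dev eth0
ip addr add 10.255.0.2/24 dev vxlan-test
ip link set vxlan-test up
# From node-d; if this fails, UDP/VXLAN traffic is being dropped between the hosts:
ping -c 4 10.255.0.2
# Clean up on both nodes afterwards:
ip link del vxlan-test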
Thank you very much for your suggestions. I'll try to run some other tests.