vpp-agent
Ligato will add unrelated config on restart
Hello,
We noticed that, for some reason, Ligato adds breaking config once it is restarted. The first apply of the config via etcd works correctly, but if we restart Ligato, additional entries are added that break our environment.
This is the config, which is applied via etcd using a script (a sketch of an equivalent etcdctl command follows below the config):
netallocConfig: {}
linuxConfig: {}
vppConfig:
  vrfs:
    - id: "4090"
      label: "4090"
    - id: "4091"
      label: "4091"
    - id: "10001"
      label: "10001"
    - id: "10002"
      label: "10002"
  interfaces:
    - name: eth0
      type: DPDK
      enabled: true
      phys_address: "e4:43:4b:e5:62:10"
      ip_addresses:
        - "10.10.12.50/24"
      vrf: 4091
      mtu: 9206
      rx_modes:
        - mode: "POLLING"
      rx_placements: []
    - name: Ext-0
      type: DPDK
      enabled: true
      phys_address: "e4:43:4b:e5:62:11"
      ip_addresses:
        - "192.168.23.151/21"
      vrf: 10001
      mtu: 9206
      rx_modes:
        - mode: POLLING
      rx_placements:
        - worker: 1
    - name: NCIC-0
      type: DPDK
      enabled: true
      phys_address: "0e:92:1a:bc:2c:37"
      mtu: 9206
      rx_modes:
        - mode: POLLING
    - name: host-Vpp2Host
      type: AF_PACKET
      enabled: true
      phys_address: "9a:dd:37:9b:ed:ad"
      rx_modes:
        - mode: INTERRUPT
      rx_placements:
        - worker: 1
      afpacket:
        host_if_name: Vpp2Host
    # SUB INTERFACES
    - name: Ext-0.2
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "10.88.99.1/29"
      vrf: 10002
      sub:
        parent_name: Ext-0
        sub_id: 2
    - name: NCIC-0.1
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "169.254.1.1/29"
      vrf: 10001
      sub:
        parent_name: NCIC-0
        sub_id: 1
    - name: NCIC-0.2
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "169.254.2.1/29"
      vrf: 10002
      sub:
        parent_name: NCIC-0
        sub_id: 2
    - name: NCIC-0.4090
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "10.10.22.25/29"
      vrf: 4090
      sub:
        parent_name: NCIC-0
        sub_id: 4090
    - name: NCIC-0.4091
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "10.10.22.9/29"
      vrf: 4091
      sub:
        parent_name: NCIC-0
        sub_id: 4091
    - name: host-Vpp2Host.4090
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "10.10.22.1/29"
      vrf: 4090
      sub:
        parent_name: host-Vpp2Host
        sub_id: 4090
    - name: host-Vpp2Host.4091
      type: SUB_INTERFACE
      enabled: true
      ip_addresses:
        - "10.10.22.33/29"
      vrf: 4091
      sub:
        parent_name: host-Vpp2Host
        sub_id: 4091
  routes:
    - type: INTRA_VRF
      dst_network: 10.10.22.202/32
      next_hop_addr: 10.10.22.26
      vrf_id: 4090
      outgoing_interface: NCIC-0.4090
    - type: INTRA_VRF
      dst_network: 10.10.23.193/32
      next_hop_addr: 10.10.22.26
      vrf_id: 4090
      outgoing_interface: NCIC-0.4090
    - type: INTRA_VRF
      dst_network: 10.10.23.194/32
      next_hop_addr: 10.10.22.26
      vrf_id: 4090
      outgoing_interface: NCIC-0.4090
    - type: INTRA_VRF
      dst_network: 10.10.25.210/32
      next_hop_addr: 10.10.22.10
      vrf_id: 4091
      outgoing_interface: NCIC-0.4091
    - type: INTRA_VRF
      dst_network: 0.0.0.0/0
      next_hop_addr: 10.10.12.100
      vrf_id: 4091
      outgoing_interface: eth0
    - type: INTRA_VRF
      dst_network: 0.0.0.0/0
      next_hop_addr: 192.168.23.211
      vrf_id: 10001
      outgoing_interface: Ext-0
    - type: INTRA_VRF
      dst_network: 0.0.0.0/0
      next_hop_addr: 10.88.99.2
      vrf_id: 10002
      outgoing_interface: Ext-0.2
    - type: INTRA_VRF
      dst_network: 10.10.22.198/32
      next_hop_addr: 10.10.22.2
      vrf_id: 4090
      outgoing_interface: host-Vpp2Host.4090
    - type: INTRA_VRF
      dst_network: 10.10.22.199/32
      next_hop_addr: 10.10.22.2
      vrf_id: 4090
      outgoing_interface: host-Vpp2Host.4090
    - type: INTRA_VRF
      dst_network: 10.10.22.200/32
      next_hop_addr: 10.10.22.2
      vrf_id: 4090
      outgoing_interface: host-Vpp2Host.4090
    - type: INTRA_VRF
      dst_network: 0.0.0.0/0
      next_hop_addr: 10.10.22.3
      vrf_id: 4090
      outgoing_interface: host-Vpp2Host.4090
    - type: INTRA_VRF
      dst_network: 10.10.25.193/32
      next_hop_addr: 10.10.22.34
      vrf_id: 4091
      outgoing_interface: host-Vpp2Host.4091
    - type: INTRA_VRF
      dst_network: 10.10.25.194/32
      next_hop_addr: 10.10.22.34
      vrf_id: 4091
      outgoing_interface: host-Vpp2Host.4091
    - type: INTRA_VRF
      dst_network: 10.10.25.197/32
      next_hop_addr: 10.10.22.34
      vrf_id: 4091
      outgoing_interface: host-Vpp2Host.4091
    - type: INTRA_VRF
      dst_network: 10.10.25.198/32
      next_hop_addr: 10.10.22.34
      vrf_id: 4091
      outgoing_interface: host-Vpp2Host.4091
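"Applied via etcd using a script" means each item is written under its own key, along the lines of the example below (the microservice label vpp1 and the JSON value are illustrative; the key layout follows the usual Ligato KV data store convention, so adjust it to your deployment):

# illustrative etcd put for a single interface key
etcdctl put /vnf-agent/vpp1/config/vpp/v2/interfaces/eth0 \
  '{"name":"eth0","type":"DPDK","enabled":true,"phys_address":"e4:43:4b:e5:62:10","ip_addresses":["10.10.12.50/24"],"vrf":4091,"mtu":9206}'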
Route entry on first apply:
vppctl show ip fib | grep 10.10.12.100 -A 2 -B 2
unicast-ip4-chain
[@0]: dpo-load-balance: [proto:ip4 index:18 buckets:1 uRPF:181 to:[45605437079:17598911219004]]
[0] [@5]: ipv4 via 10.10.12.100 eth0: mtu:9000 next:3 e4434bfcc774e4434be562100800
0.0.0.0/32
unicast-ip4-chain
--
[@0]: dpo-load-balance: [proto:ip4 index:81 buckets:1 uRPF:83 to:[0:0]]
[0] [@2]: dpo-receive: 10.10.12.50 on eth0
10.10.12.100/32
unicast-ip4-chain
[@0]: dpo-load-balance: [proto:ip4 index:154 buckets:1 uRPF:82 to:[9:756]]
[0] [@5]: ipv4 via 10.10.12.100 eth0: mtu:9000 next:3 e4434bfcc774e4434be562100800
10.10.12.255/32
unicast-ip4-chain
Route entry on ligato restart (docker restart ligato):
unicast-ip4-chain
[@0]: dpo-load-balance: [proto:ip4 index:8 buckets:2 uRPF:103 to:[0:0]]
[0] [@5]: ipv4 via 10.10.12.100 eth0: mtu:9000 next:4 e4434bfcc774e4434be562100800
[1] [@3]: arp-ipv4: via 192.168.23.211 Ext-0
0.0.0.0/32
--
[@0]: dpo-load-balance: [proto:ip4 index:81 buckets:1 uRPF:31 to:[0:0]]
[0] [@2]: dpo-receive: 10.10.12.50 on eth0
10.10.12.100/32
unicast-ip4-chain
[@0]: dpo-load-balance: [proto:ip4 index:90 buckets:1 uRPF:99 to:[0:0]]
[0] [@5]: ipv4 via 10.10.12.100 eth0: mtu:9000 next:4 e4434bfcc774e4434be562100800
10.10.12.255/32
unicast-ip4-chain
--
unicast-ip4-chain
[@0]: dpo-load-balance: [proto:ip4 index:53 buckets:2 uRPF:103 to:[26077:2329172]]
[0] [@5]: ipv4 via 10.10.12.100 eth0: mtu:9000 next:4 e4434bfcc774e4434be562100800
[1] [@3]: arp-ipv4: via 192.168.23.211 Ext-0
0.0.0.0/32
Note the added ARP entry.
Using goVPP directly and adding the route twice works fine, which suggests that something is wrong in how Ligato constructs the models.
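For reference, this is roughly what we mean by "using goVPP directly" (a simplified sketch rather than our exact code; import paths and type names follow the vpp2101 binapi bundled with the agent as we remember them, and the sw_if_index value is illustrative):

package main

import (
	"log"

	"git.fd.io/govpp.git"
	"go.ligato.io/vpp-agent/v3/plugins/vpp/binapi/vpp2101/fib_types"
	"go.ligato.io/vpp-agent/v3/plugins/vpp/binapi/vpp2101/ip"
	"go.ligato.io/vpp-agent/v3/plugins/vpp/binapi/vpp2101/ip_types"
)

func main() {
	// Connect to the VPP binary-API socket mounted from the host.
	conn, err := govpp.Connect("/run/vpp/api.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Disconnect()

	ch, err := conn.NewAPIChannel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// Default route in VRF 4091 via 10.10.12.100 on eth0 (sw_if_index 1 is illustrative).
	req := &ip.IPRouteAddDel{
		IsAdd: true,
		Route: ip.IPRoute{
			TableID: 4091,
			Prefix: ip_types.Prefix{
				Address: ip_types.Address{
					Af: ip_types.ADDRESS_IP4,
					Un: ip_types.AddressUnionIP4(ip_types.IP4Address{0, 0, 0, 0}),
				},
				Len: 0,
			},
			NPaths: 1,
			Paths: []fib_types.FibPath{{
				SwIfIndex: 1,
				TableID:   4091,
				Proto:     fib_types.FIB_API_PATH_NH_PROTO_IP4,
				Nh: fib_types.FibPathNh{
					Address: ip_types.AddressUnionIP4(ip_types.IP4Address{10, 10, 12, 100}),
				},
			}},
		},
	}

	// Sending the identical add twice leaves a single path in the FIB,
	// i.e. no extra ARP/multipath bucket shows up.
	for i := 0; i < 2; i++ {
		reply := &ip.IPRouteAddDelReply{}
		if err := ch.SendRequest(req).ReceiveReply(reply); err != nil {
			log.Fatal(err)
		}
	}
}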
Any idea where to look for the issue?
Can you run agentctl config get and agentctl config retrieve before and after the restart? Also, please include logs for both runs. Thanks.
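For example, something like this would capture both views (the output file names are just suggestions):

agentctl config get      > config-get-before.yaml
agentctl config retrieve > config-retrieve-before.yaml
docker restart ligato
agentctl config get      > config-get-after.yaml
agentctl config retrieve > config-retrieve-after.yaml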
Will do! Our env is currently broken, but once I can repeat the setup I will add the info here.
OK, I have done some more investigating.
I have been using the REST API instead for these tests. This flow causes the issue (a sketch of the call we use follows the list):
- Create config
- Restart vpp-agent
- Create config again
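"Create config" here means a PUT of the same YAML, roughly like the sketch below (illustrative only; the /configuration endpoint and port 9191 are the agent REST plugin defaults as far as we can tell, so adjust to your setup):

curl -X PUT -H "Content-Type: application/yaml" \
  --data-binary @config.yaml \
  http://localhost:9191/configuration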
At this point, calling retrieve causes a crash:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1409039]
goroutine 137 [running]:
go.ligato.io/vpp-agent/v3/plugins/vpp/ifplugin/vppcalls/vpp2101.(*InterfaceVppHandler).dumpInterfaces(0xc000457550, 0x0, 0x0, 0x0, 0x203000, 0x4, 0x203000)
go.ligato.io/vpp-agent/v3/plugins/vpp/ifplugin/vppcalls/vpp2101/dump_interface_vppcalls.go:142 +0x5d9
go.ligato.io/vpp-agent/v3/plugins/vpp/ifplugin/vppcalls/vpp2101.(*InterfaceVppHandler).DumpInterfaces(0xc000457550, 0x1ff4dc0, 0xc0011966c0, 0x1bfbb80, 0xc000af7901, 0x14e690d)
go.ligato.io/vpp-agent/v3/plugins/vpp/ifplugin/vppcalls/vpp2101/dump_interface_vppcalls.go:175 +0x54
go.ligato.io/vpp-agent/v3/plugins/configurator.(*dumpService).DumpInterfaces(0x304adb8, 0x1ff4dc0, 0xc0011966c0, 0x0, 0x0, 0x0, 0x3, 0x0)
go.ligato.io/vpp-agent/v3/plugins/configurator/dump.go:204 +0x6d
go.ligato.io/vpp-agent/v3/plugins/configurator.(*dumpService).Dump(0x304adb8, 0x1ff4dc0, 0xc0011966c0, 0xc0011966f0, 0x0, 0x0, 0x0)
go.ligato.io/vpp-agent/v3/plugins/configurator/dump.go:81 +0x19d
go.ligato.io/vpp-agent/v3/plugins/configurator.(*configuratorServer).Dump(0x304adb8, 0x1ff4dc0, 0xc0011966c0, 0xc0011966f0, 0x304adb8, 0x2000be0, 0xc000d837c0)
go.ligato.io/vpp-agent/v3/plugins/configurator/configurator.go:56 +0x4b
go.ligato.io/vpp-agent/v3/proto/ligato/configurator._ConfiguratorService_Dump_Handler.func1(0x1ff4dc0, 0xc0011966c0, 0x1b0fb20, 0xc0011966f0, 0x2d, 0xc00120f310, 0x0, 0xc000af7b30)
go.ligato.io/vpp-agent/v3/proto/ligato/configurator/configurator_grpc.pb.go:217 +0x89
github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1(0x1ff4dc0, 0xc0011966c0, 0x1b0fb20, 0xc0011966f0, 0xc000ce3700, 0xc000ce3720, 0xc000c69ba0, 0x5cc246, 0x1c1f360, 0xc0011966c0)
github.com/grpc-ecosystem/[email protected]/server_metrics.go:107 +0xad
go.ligato.io/vpp-agent/v3/proto/ligato/configurator._ConfiguratorService_Dump_Handler(0x1d05ba0, 0x304adb8, 0x1ff4dc0, 0xc0011966c0, 0xc000ce4fc0, 0xc000a6d3b0, 0x1ff4dc0, 0xc0011966c0, 0x0, 0x0)
go.ligato.io/vpp-agent/v3/proto/ligato/configurator/configurator_grpc.pb.go:219 +0x150
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0000e8000, 0x2009a60, 0xc0011de480, 0xc000b86d00, 0xc0007150b0, 0x2f609a8, 0x0, 0x0, 0x0)
google.golang.org/[email protected]/server.go:1082 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc0000e8000, 0x2009a60, 0xc0011de480, 0xc000b86d00, 0x0)
google.golang.org/[email protected]/server.go:1405 +0xcc5
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc000396cb0, 0xc0000e8000, 0x2009a60, 0xc0011de480, 0xc000b86d00)
google.golang.org/[email protected]/server.go:746 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
google.golang.org/[email protected]/server.go:744 +0xa5
agentctl config get gives this:
netallocConfig: {}
linuxConfig: {}
vppConfig:
  interfaces:
    - name: NCIC-0.4090
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 10.10.22.25/29
      vrf: 4090
      sub:
        parentName: NCIC-0
        subId: 4090
    - name: host-Vpp2Host.4091
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 10.10.22.33/29
      vrf: 4091
      sub:
        parentName: host-Vpp2Host
        subId: 4091
    - name: NCIC-0.1
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 169.254.1.1/29
      vrf: 10001
      sub:
        parentName: NCIC-0
        subId: 1
    - name: NCIC-0.4091
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 10.10.22.9/29
      vrf: 4091
      sub:
        parentName: NCIC-0
        subId: 4091
    - name: NCIC-0
      type: DPDK
      enabled: true
      physAddress: 0e:92:1a:bc:2c:37
      mtu: 9206
      rxModes:
        - mode: POLLING
    - name: host-Vpp2Host
      type: AF_PACKET
      enabled: true
      physAddress: 9a:dd:37:9b:ed:ad
      rxModes:
        - mode: INTERRUPT
      rxPlacements:
        - worker: 1
      afpacket:
        hostIfName: Vpp2Host
    - name: eth0
      type: DPDK
      enabled: true
      physAddress: e4:43:4b:e5:62:10
      ipAddresses:
        - 10.10.12.50/24
      vrf: 4091
      mtu: 9206
      rxModes:
        - mode: POLLING
    - name: Ext-0.2
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 10.88.99.1/29
      vrf: 10002
      sub:
        parentName: Ext-0
        subId: 2
    - name: Ext-0
      type: DPDK
      enabled: true
      physAddress: e4:43:4b:e5:62:11
      ipAddresses:
        - 192.168.23.151/21
      vrf: 10001
      mtu: 9206
      rxModes:
        - mode: POLLING
      rxPlacements:
        - worker: 1
    - name: NCIC-0.2
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 169.254.2.1/29
      vrf: 10002
      sub:
        parentName: NCIC-0
        subId: 2
    - name: host-Vpp2Host.4090
      type: SUB_INTERFACE
      enabled: true
      ipAddresses:
        - 10.10.22.1/29
      vrf: 4090
      sub:
        parentName: host-Vpp2Host
        subId: 4090
  routes:
    - vrfId: 4091
      dstNetwork: 10.10.25.197/32
      nextHopAddr: 10.10.22.34
      outgoingInterface: host-Vpp2Host.4091
    - vrfId: 4091
      dstNetwork: 10.10.25.210/32
      nextHopAddr: 10.10.22.10
      outgoingInterface: NCIC-0.4091
    - vrfId: 4091
      dstNetwork: 10.10.25.194/32
      nextHopAddr: 10.10.22.34
      outgoingInterface: host-Vpp2Host.4091
    - vrfId: 4090
      dstNetwork: 0.0.0.0/0
      nextHopAddr: 10.10.22.3
      outgoingInterface: host-Vpp2Host.4090
    - vrfId: 4090
      dstNetwork: 10.10.23.193/32
      nextHopAddr: 10.10.22.26
      outgoingInterface: NCIC-0.4090
    - vrfId: 4090
      dstNetwork: 10.10.22.200/32
      nextHopAddr: 10.10.22.2
      outgoingInterface: host-Vpp2Host.4090
    - vrfId: 4091
      dstNetwork: 10.10.25.193/32
      nextHopAddr: 10.10.22.34
      outgoingInterface: host-Vpp2Host.4091
    - vrfId: 4090
      dstNetwork: 10.10.23.194/32
      nextHopAddr: 10.10.22.26
      outgoingInterface: NCIC-0.4090
    - vrfId: 4090
      dstNetwork: 10.10.22.198/32
      nextHopAddr: 10.10.22.2
      outgoingInterface: host-Vpp2Host.4090
    - vrfId: 4090
      dstNetwork: 10.10.22.199/32
      nextHopAddr: 10.10.22.2
      outgoingInterface: host-Vpp2Host.4090
    - vrfId: 10002
      dstNetwork: 0.0.0.0/0
      nextHopAddr: 10.88.99.2
      outgoingInterface: Ext-0.2
    - vrfId: 4091
      dstNetwork: 10.10.25.198/32
      nextHopAddr: 10.10.22.34
      outgoingInterface: host-Vpp2Host.4091
    - vrfId: 10001
      dstNetwork: 0.0.0.0/0
      nextHopAddr: 192.168.23.211
      outgoingInterface: Ext-0
    - vrfId: 4090
      dstNetwork: 10.10.22.202/32
      nextHopAddr: 10.10.22.26
      outgoingInterface: NCIC-0.4090
    - vrfId: 4091
      dstNetwork: 0.0.0.0/0
      nextHopAddr: 10.10.12.100
      outgoingInterface: eth0
  vrfs:
    - id: 4090
      label: "4090"
    - id: 4091
      label: "4091"
    - id: 10001
      label: "10001"
    - id: 10002
      label: "10002"
Worth mentioning: retrieve never works for us:
ERROR: rpc error: code = Unknown desc = l2_xconnect_dump crashes in VPP 21.01, dump will be skipped
Another thing worth mentioning: we run VPP on the host and mount its sockets into the vpp-agent container. Could this be a compatibility issue?
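For context, the agent container is started roughly along these lines (image tag, container name, and socket paths are illustrative, not our exact command):

docker run --name ligato -d \
  -v /run/vpp/api.sock:/run/vpp/api.sock \
  -v /run/vpp/cli.sock:/run/vpp/cli.sock \
  ligato/vpp-agent:v3.2.0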