frr
BFD session is not established after interface recreation
Description
If the interface is recreated, the BGP session is not established.
Version
10.2.1
How to reproduce
- Configure BGP as follows:
neighbor 172.30.255.24 remote-as 4200220005
neighbor 172.30.255.24 bfd
neighbor 172.30.255.24 update-source qt-swep0
neighbor 172.30.255.24 timers 1 4
- Remove the network interface
- Create the network interface again
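The remove/create steps can probably be approximated without QEMU by using iproute2. This is a minimal sketch, assuming that a tap device created with `ip tuntap` triggers the same interface delete/add events in FRR as the QEMU-created one (not verified in this report). `run`, `recreate_tap` and `DRY_RUN` are illustrative names, not part of FRR or QEMU. Note that a tap device with no process attached has no carrier, so this only exercises the recreation path.

```shell
#!/bin/sh
# Sketch: delete and recreate the tap interface used as update-source.
# Assumption: a tap device made with `ip tuntap` stands in for the QEMU one.
# run, recreate_tap and DRY_RUN are hypothetical helpers for illustration.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_tap() {
    ifname="$1"   # interface name, e.g. qt-swep0
    addr="$2"     # address with prefix length, e.g. 172.30.255.25/31
    run ip link delete "$ifname"
    run ip tuntap add dev "$ifname" mode tap
    run ip addr add "$addr" dev "$ifname"
    run ip link set dev "$ifname" up
}

# Example (as root): recreate_tap qt-swep0 172.30.255.25/31
```

Setting `DRY_RUN=1` before calling `recreate_tap` prints the four `ip` commands instead of running them, which is handy for checking the sequence on a production host first.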
Expected behavior
The BFD session goes down when the network interface (or the IP address used by update-source) is removed, and is established again once the interface is recreated.
Actual behavior
The BFD session is not established again after the interface is recreated.
Additional context
- Debug log:
2025/01/12 10:10:26 BGP: [GPPQK-HK3ZM] bfd_get_peer_info: Can't find interface by ifindex: 18
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid device index
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=843, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid nexthop id
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=844, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid device index
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=847, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid nexthop id
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=848, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (280[]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (279[226/280]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (282[]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (281[228/282]) into the kernel
2025/01/12 10:10:27 BFD: [GCWEX-N0BBE] zclient: add interface qt-swep0 (VRF default(0))
2025/01/12 10:10:27 BFD: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025/01/12 10:10:27 BFD: [SSYGJ-9ZAE0] zclient: add local address 172.30.255.25/31 (VRF 0)
2025/01/12 10:10:27 BFD: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025/01/12 10:10:27 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:28 BGP: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Socket not connected
2025/01/12 10:10:28 BGP: [H4B4J-DCW2R][EC 33554455] 172.30.255.24 [Error] bgp_read_packet error: Connection reset by peer
2025/01/12 10:10:28 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:28 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::4c9:7fff:fe4e:fae2/64 (VRF 0)
2025/01/12 10:10:28 BFD: [SSYGJ-9ZAE0] zclient: add local address fec0:112:acab::5/127 (VRF 0)
2025/01/12 10:10:29 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:30 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:31 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
- show bfd peer:
neigh# show bfd peer
BFD Peers:
peer 172.30.255.24 local-address 172.30.255.25 vrf default interface qt-swep0
ID: 2613937113
Remote ID: 2632334212
Active mode
Status: init
Diagnostics: path down
Remote diagnostics: control detection time expired
Peer Type: dynamic
RTT min/avg/max: 0/0/0 usec
Local timers:
Detect-multiplier: 3
Receive interval: 300ms
Transmission interval: 300ms
Echo receive interval: 50ms
Echo transmission interval: disabled
Remote timers:
Detect-multiplier: 3
Receive interval: 300ms
Transmission interval: 300ms
Echo receive interval: 50ms
- tcpdump:
~ # tcpdump -i qt-swep0 -ne
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on qt-swep0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:14:22.584911 2a:9a:e0:00:14:40 > 06:c9:7f:4e:fa:e2, ethertype IPv4 (0x0800), length 66: 172.30.255.24.65263 > 172.30.255.25.3784: BFDv1, Control, State Down, Flags: [none], length: 24
10:14:23.335449 2a:9a:e0:00:14:40 > 06:c9:7f:4e:fa:e2, ethertype IPv4 (0x0800), length 66: 172.30.255.24.65263 > 172.30.255.25.3784: BFDv1, Control, State Down, Flags: [none], length: 24
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
Checklist
- [X] I have searched the open issues for this bug.
- [X] I have not included sensitive information in this report.
What do you mean by "Remove network interface"? Do you mean no neighbor 172.30.255.24 update-source qt-swep0?
qt-swep0 is a QEMU tap interface.
The interface is removed when the QEMU process stops. Example:
/usr/bin/qemu-system-x86_64 -nodefaults -display none -netdev tap,ifname=qt-swep0,id=netdev0,script=/etc/qemu/ifup,downscript=/etc/qemu/ifdown -netdev socket,listen=192.168.252.5:42000,id=socket0 -netdev hubport,hubid=0,netdev=netdev0,id=hubport1 -netdev hubport,hubid=0,netdev=socket0,id=hubport
/etc/qemu/ifup:
#!/bin/sh
if [[ $1 == "qt-swep0" ]]; then
ip addr add 172.30.255.25/31 dev qt-swep0
fi
Interface naming: qt is the interface type (QEMU tunnel), swep is the node name, and the trailing number is the interface number.
Is this a regression, or does it behave the same with 10.0 or 10.1?
I suppose it is, but I have not tested it.
I would be surprised if bfd handles a change of the interface index (ifindex) appropriately once it has been assigned. I suspect we would need to check the behavior of a bunch of daemons. This is not something that we are testing or attempting to do.
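Whether the interface index actually changes can be checked from sysfs. The kernel typically assigns a new ifindex when an interface is deleted and created again, which would be consistent with the "Can't find interface by ifindex: 18" line in the debug log. A minimal sketch; `get_ifindex` and the `SYSFS_NET` override are illustrative names, not FRR tooling:

```shell
#!/bin/sh
# Sketch: print the kernel ifindex of an interface by reading sysfs.
# get_ifindex is a hypothetical helper; SYSFS_NET can be overridden for testing.

get_ifindex() {
    path="${SYSFS_NET:-/sys/class/net}/$1/ifindex"
    if [ -r "$path" ]; then
        cat "$path"
    else
        echo "interface $1 not found" >&2
        return 1
    fi
}

# Example: run before and after restarting QEMU and compare the two values.
# get_ifindex qt-swep0
```

If the two values differ, any daemon that cached the old index will be looking up an interface that no longer exists.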
- The bug does not appear when a gretap tunnel is used. It appears only with a tuntap interface.
- The bug may appear only on non-systemd systems.
My lab:
- FRR 10.2.1 [Debian bookworm]: does not appear
- FRR 10.1 [Debian bookworm]: does not appear
- FRR 10.0 [Debian bookworm]: does not appear
Production [my VPS]:
- Alpine 3.21 (FRR 10.2.1): appears
The host in both lab and production is Debian bookworm.
This should be fixed in the latest releases, could you test it?