
BFD session is not established after interface recreation

Open ne-vlezay80 opened this issue 10 months ago • 8 comments

Description

If the interface is recreated, the BGP session is not established.

Version

10.2.1

How to reproduce

  1. Configure BGP with:
 neighbor 172.30.255.24 remote-as 4200220005
 neighbor 172.30.255.24 bfd
 neighbor 172.30.255.24 update-source qt-swep0
 neighbor 172.30.255.24 timers 1 4
  2. Remove the network interface
  3. Create the network interface again
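
The remove/create steps above can be approximated without QEMU. This is only a sketch under assumptions: a persistent tap device created with `ip tuntap` stands in for the QEMU-managed tap from the report, and the helper names `run` and `recreate_tap` are hypothetical. A BFD peer must still be reachable on the other side for the session to come up, so this reproduces the interface lifecycle only.

```sh
#!/bin/sh
# Sketch: delete and recreate a tap interface so that it gets a new ifindex.
# Assumption: a persistent tap made with `ip tuntap` replaces the QEMU tap.
# Set DRY_RUN=1 to print the commands instead of running them (no root needed).

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_tap() {
    ifname=$1   # e.g. qt-swep0
    addr=$2     # e.g. 172.30.255.25/31
    run ip tuntap del dev "$ifname" mode tap
    run ip tuntap add dev "$ifname" mode tap
    run ip addr add "$addr" dev "$ifname"
    run ip link set dev "$ifname" up
}
```

With `DRY_RUN=1` set, `recreate_tap qt-swep0 172.30.255.25/31` only prints the four `ip` commands; without it, the commands need root privileges.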

Expected behavior

The BFD session is not established if the network interface or the IP address used by update-source is removed.

Actual behavior

The BFD session is established after interface recreation.

Additional context

  1. Debug log:
2025/01/12 10:10:26 BGP: [GPPQK-HK3ZM] bfd_get_peer_info: Can't find interface by ifindex: 18 
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid device index
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=843, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid nexthop id
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=844, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid device index
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=847, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [HSYZM-HV7HF] Extended Error: Invalid nexthop id
2025/01/12 10:10:27 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=848, pid=4199702981
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (280[]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (279[226/280]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (282[]) into the kernel
2025/01/12 10:10:27 ZEBRA: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (281[228/282]) into the kernel
2025/01/12 10:10:27 BFD: [GCWEX-N0BBE] zclient: add interface qt-swep0 (VRF default(0))
2025/01/12 10:10:27 BFD: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025/01/12 10:10:27 BFD: [SSYGJ-9ZAE0] zclient: add local address 172.30.255.25/31 (VRF 0)
2025/01/12 10:10:27 BFD: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025/01/12 10:10:27 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:28 BGP: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Socket not connected
2025/01/12 10:10:28 BGP: [H4B4J-DCW2R][EC 33554455] 172.30.255.24 [Error] bgp_read_packet error: Connection reset by peer
2025/01/12 10:10:28 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:28 BFD: [SSYGJ-9ZAE0] zclient: add local address fe80::4c9:7fff:fe4e:fae2/64 (VRF 0)
2025/01/12 10:10:28 BFD: [SSYGJ-9ZAE0] zclient: add local address fec0:112:acab::5/127 (VRF 0)
2025/01/12 10:10:29 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:30 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
2025/01/12 10:10:31 BFD: [YA0Q5-C0BPV] control-packet: 'remote discriminator' is zero, not overridden [mhop:no peer:172.30.255.24 local:172.30.255.25 port:19]
  2. show bfd peer:
neigh# show bfd peer
BFD Peers:
        peer 172.30.255.24 local-address 172.30.255.25 vrf default interface qt-swep0
                ID: 2613937113
                Remote ID: 2632334212
                Active mode
                Status: init
                Diagnostics: path down
                Remote diagnostics: control detection time expired
                Peer Type: dynamic
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
  3. tcpdump:
~ # tcpdump -i qt-swep0 -ne
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on qt-swep0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:14:22.584911 2a:9a:e0:00:14:40 > 06:c9:7f:4e:fa:e2, ethertype IPv4 (0x0800), length 66: 172.30.255.24.65263 > 172.30.255.25.3784: BFDv1, Control, State Down, Flags: [none], length: 24
10:14:23.335449 2a:9a:e0:00:14:40 > 06:c9:7f:4e:fa:e2, ethertype IPv4 (0x0800), length 66: 172.30.255.24.65263 > 172.30.255.25.3784: BFDv1, Control, State Down, Flags: [none], length: 24
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
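
To watch whether the session ever leaves the `init` state shown above, the `Status:` field can be pulled out of the `show bfd peer` output. This is a minimal sketch; `bfd_status` is a hypothetical helper name, and it reports only the first peer in the output.

```sh
#!/bin/sh
# Sketch: read `show bfd peer` output on stdin and print the first
# "Status:" value (for example "init", "down" or "up").

bfd_status() {
    awk '$1 == "Status:" { print $2; exit }'
}
```

A possible use, assuming `vtysh` is available on the router: `vtysh -c 'show bfd peer' | bfd_status`.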

Checklist

  • [X] I have searched the open issues for this bug.
  • [X] I have not included sensitive information in this report.

ne-vlezay80 avatar Jan 12 '25 10:01 ne-vlezay80

What do you mean by "Remove network interface"? Do you mean no neighbor 172.30.255.24 update-source qt-swep0?

ton31337 avatar Jan 12 '25 21:01 ton31337

> What do you mean "Remove network interface"? no neighbor 172.30.255.24 update-source qt-swep0?

qt-swep0 is a QEMU tap interface.

The interface is removed when the QEMU process stops. Example:

/usr/bin/qemu-system-x86_64 -nodefaults -display none -netdev tap,ifname=qt-swep0,id=netdev0,script=/etc/qemu/ifup,downscript=/etc/qemu/ifdown -netdev socket,listen=192.168.252.5:42000,id=socket0 -netdev hubport,hubid=0,netdev=netdev0,id=hubport1 -netdev hubport,hubid=0,netdev=socket0,id=hubport

/etc/qemu/ifup:

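#!/bin/sh

# QEMU passes the tap interface name as $1.
# Use the POSIX test syntax, since [[ ]] is not guaranteed under /bin/sh.
if [ "$1" = "qt-swep0" ]; then
 ip addr add 172.30.255.25/31 dev qt-swep0
fi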

Interface naming: qt = interface type (QEMU tunnel), swep = node name, trailing number = interface number.

ne-vlezay80 avatar Jan 12 '25 21:01 ne-vlezay80

Is this a regression, or is it the same with 10.0 or 10.1?

ton31337 avatar Jan 14 '25 07:01 ton31337

> Is this a regression or it's the same with 10.0 or 10.1?

I suppose it is, but I have not tested it.

ne-vlezay80 avatar Jan 14 '25 10:01 ne-vlezay80

I would be surprised if bfd handles the interface index change appropriately once a new index is assigned. I suspect we would need to check the behavior of a bunch of daemons. This is not something that we are testing or attempting to do.
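
The `Can't find interface by ifindex: 18` line in the debug log is consistent with this: a recreated interface normally gets a new kernel ifindex. One way to confirm it is to read the index from sysfs before and after the recreation. This is a sketch only; `ifindex_of` is a hypothetical helper, and the `SYSFS_NET` override is an assumption added purely so the function can be exercised without a real interface.

```sh
#!/bin/sh
# Sketch: print the kernel ifindex of a network interface.
# SYSFS_NET (hypothetical) overrides the sysfs directory for testing.

ifindex_of() {
    dir=${SYSFS_NET:-/sys/class/net}
    if [ -r "$dir/$1/ifindex" ]; then
        cat "$dir/$1/ifindex"
    else
        echo "no such interface: $1" >&2
        return 1
    fi
}
```

Running `ifindex_of qt-swep0` before stopping QEMU and again after restarting it should show whether the index changed.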

donaldsharp avatar Jan 15 '25 14:01 donaldsharp

  1. The bug does not appear with a gretap tunnel. It appears only with a tuntap interface.
  2. The bug may appear only on non-systemd systems.

ne-vlezay80 avatar Jan 17 '25 06:01 ne-vlezay80

My lab:

  • FRR 10.2.1 [Debian bookworm]: does not appear
  • FRR 10.1 [Debian bookworm]: does not appear
  • FRR 10.0 [Debian bookworm]: does not appear

Production [my VPS]:

  • Alpine 3.21 (FRR 10.2.1): appears

Host for both lab and production: Debian bookworm.

ne-vlezay80 avatar Jan 17 '25 06:01 ne-vlezay80

This should be fixed in the latest releases, could you test it?

ton31337 avatar Jun 10 '25 05:06 ton31337