BGP fails to determine nexthop routed directly to interface

Open vgrebenschikov opened this issue 11 months ago • 16 comments

Description

The BGP route is shown as "no best path" and its next hop as (inaccessible):

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  65022
    172.22.4.253 (inaccessible) from 172.22.4.253 (172.22.9.1)
      Origin IGP, invalid, external
      Last update: Tue Mar 12 15:52:46 2024

while the route to the next hop is fine (kernel route), although it points directly at a point-to-point interface:

srv# show ip route 172.22.4.253/32
Routing entry for 172.22.4.253/32
  Known via "kernel", distance 0, metric 0, best
  Last update 00:02:37 ago
  * directly connected, wg0

or in system:

# route -n get 172.22.4.253/32
   route to: 172.22.4.253
destination: 172.22.4.253
        fib: 0
  interface: wg0
      flags: <UP,HOST,DONE,STATIC>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1420         1         0

adding the same static route does not change anything:

srv# conf t
srv(config)# ip route 172.22.4.253/32 wg0
srv(config)# 

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  65022
    172.22.4.253 (inaccessible) from 172.22.4.253 (172.22.9.1)
      Origin IGP, invalid, external
      Last update: Tue Mar 12 15:55:03 2024

But if I assign a whole subnet to the interface:

# ifconfig wg0 172.22.4.192/26

Everything works as expected:

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 7
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  172.22.4.253
  65022
    172.22.4.253 (metric 1) from 172.22.4.253 (172.22.9.1)
      Origin IGP, valid, external, best (First path received)
      Last update: Tue Mar 12 15:55:04 2024

Looks like there is a problem in the next-hop availability algorithm.

FRRouting 8.5.4 (srv) on FreeBSD(14.0-RELEASE).

Version

# show version
FRRouting 8.5.4 (srv) on FreeBSD(14.0-RELEASE).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-vtysh' '--disable-doc-html' '--sysconfdir=/usr/local/etc/frr' '--localstatedir=/var/run/frr' '--disable-nhrpd' '--disable-pathd' '--disable-ospfclient' '--disable-pimd' '--disable-pbrd' '--with-vtysh-pager=cat' '--enable-backtrace' '--disable-config-rollbacks' '--disable-datacenter' '--enable-fpm' '--disable-ldpd' '--without-libpam' '--enable-rpki' '--disable-sharpd' '--disable-shell-access' '--disable-snmp' '--disable-tcmalloc' '--prefix=/usr/local' '--mandir=/usr/local/man' '--disable-silent-rules' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd14.0' 'build_alias=amd64-portbld-freebsd14.0' 'PKG_CONFIG=pkgconf' 'PKG_CONFIG_LIBDIR=/wrkdirs/usr/ports/net/frr8/work/.pkgconfig:/usr/local/libdata/pkgconfig:/usr/local/share/pkgconfig:/usr/libdata/pkgconfig' 'CC=cc' 'CFLAGS=-O2 -pipe -fstack-protector-strong -fno-strict-aliasing ' 'LDFLAGS= -L/usr/local/lib -L/usr/local/lib -fstack-protector-strong ' 'LIBS=' 'CPPFLAGS=-I/usr/local/include -I/usr/local/include' 'CPP=cpp' 'CXX=c++' 'CXXFLAGS=-O2 -pipe -fstack-protector-strong -fno-strict-aliasing ' 'PYTHON=/usr/local/bin/python3.9'

How to reproduce

Use any point-to-point interface without a subnet (e.g. wg0) to establish a BGP session.

Expected behavior

A valid direct route to an interface should be taken into account when installing BGP routes via that interface, as in:

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 7
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  172.22.4.253
  65022
    172.22.4.253 (metric 1) from 172.22.4.253 (172.22.9.1)
      Origin IGP, valid, external, best (First path received)
      Last update: Tue Mar 12 15:55:04 2024

Actual behavior

BGP route is not installed:

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  65022
    172.22.4.253 (inaccessible) from 172.22.4.253 (172.22.9.1)
      Origin IGP, invalid, external
      Last update: Tue Mar 12 15:52:46 2024

Additional context

No response

Checklist

  • [X] I have searched the open issues for this bug.
  • [X] I have not included sensitive information in this report.

vgrebenschikov avatar Mar 12 '24 21:03 vgrebenschikov

Could you try enabling ip nht resolve-via-default?

ton31337 avatar Mar 13 '24 12:03 ton31337

Could you try enabling ip nht resolve-via-default

yep, the same:

srv# conf t
srv(config)# ip nht resolve-via-default
srv(config)#
srv#
srv# clear ip bgp *

srv# show ip bgp neighbors 172.22.4.253 received
BGP table version is 3, local router ID is 172.22.1.5, vrf id 0
Default local pref 100, local AS 65021
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *> 172.22.0.0/19    172.22.4.253                           0 65022 65021 i
 *> 172.22.9.0/24    172.22.4.253                           0 65022 i
 *> 172.22.19.0/24   172.22.4.253                           0 65022 65023 i
 *> 172.22.20.0/24   172.22.4.253                           0 65022 i
 *> 172.22.21.0/24   172.22.4.253                           0 65022 i
 *> 172.22.24.0/24   172.22.4.253                           0 65022 i
 *> 172.23.0.0/16    172.22.4.253                           0 65022 65021 i
 *> 172.24.1.0/24    172.22.4.253                           0 65022 65021 i

Total number of prefixes 8

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  65022
    172.22.4.253 (inaccessible) from 172.22.4.253 (172.22.9.1)
      Origin IGP, invalid, external
      Last update: Wed Mar 13 09:34:02 2024
srv#

vgrebenschikov avatar Mar 13 '24 17:03 vgrebenschikov

Could you show the output of show ip nht? And also try enabling debugging with debug bgp nht.

ton31337 avatar Mar 13 '24 20:03 ton31337

srv# show ip nht
172.22.2.1
 resolved via connected
 is directly connected, re0 (vrf default)
 Client list: static(fd 27)
172.22.4.253(Connected)
 unresolved(Connected)
 Client list: bgp(fd 32)

and bgpd.log:

2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 0 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 0 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 0 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [VKMV1-4Y773] bgp_update(172.22.4.253): NH unresolved
2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 1 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [VKMV1-4Y773] bgp_update(172.22.4.253): NH unresolved
2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 2 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [VKMV1-4Y773] bgp_update(172.22.4.253): NH unresolved
2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 3 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [VKMV1-4Y773] bgp_update(172.22.4.253): NH unresolved
2024/03/13 13:31:23 BGP: [WNKP5-SN018] Found existing bnc 172.22.4.253/32(0)(VRF default) flags 0xe ifindex 0 #paths 4 peer 0x28da947a4580
2024/03/13 13:31:23 BGP: [VKMV1-4Y773] bgp_update(172.22.4.253): NH unresolved

Anyway, the default route is the wrong direction for that route; it should point to wg0.

vgrebenschikov avatar Mar 13 '24 21:03 vgrebenschikov

@vgrebenschikov can we get the following outputs also:

show bgp nexthop
show bgp import-check-table
show ip import-check

ton31337 avatar Apr 04 '24 20:04 ton31337

@vgrebenschikov can you post the wireguard config (minus any keys of course) ?

beith12 avatar Apr 12 '24 17:04 beith12

can you please provide the configuration topology for recreation of this bug.

tera2603 avatar Jun 27 '24 10:06 tera2603

Even if a fix were made to propagate directly connected routes (where the outgoing interface has no IP address assigned) from the kernel to the BGP routing table, it would not enable sending traffic in the reverse direction. In a Data Center environment where traffic is predominantly TCP-based, bidirectional traffic must be expected. Therefore, an interface such as wg0 without an IP address configured would be considered incomplete from the Data Center perspective.

nandini660 avatar Jun 27 '24 13:06 nandini660

Even if a fix were made to propagate directly connected routes (where the outgoing interface has no IP address assigned) from the kernel to the BGP routing table, it would not enable sending traffic in the reverse direction. In a Data Center environment where traffic is predominantly TCP-based, bidirectional traffic must be expected. Therefore, an interface such as wg0 without an IP address configured would be considered incomplete from the Data Center perspective.

It has an IP address assigned, but it is not a "broadcast" interface, so, like any other P2P interface, it has an address on our end and routes into the interface, and that's it.

Similar problem with tun interface for example.

The kernel is very clear on this:

a. there are direct interface routes: "packets with dst in prefix sent to interface directly"
b. there are routes with a next hop: "packets with dst in prefix sent to next-hop connected via interface"

somehow we have lost scenario a. above for BGP ...
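
The two kernel route kinds described above can be modelled in a few lines. This is only a toy sketch, not FRR or kernel code: the `Route` class and `lookup` function are hypothetical names, and the table borrows the wg0 host route and an OSPF-learned 172.22.0.0/16 route via 172.22.2.1 on em0 from the outputs in this thread. It shows that a plain longest-prefix match would resolve the peer through the direct interface route (kind a), which has no gateway at all.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical model of a kernel route entry."""
    prefix: IPv4Network
    interface: str
    gateway: Optional[IPv4Address] = None  # None: send directly to the interface

    @property
    def kind(self) -> str:
        if self.gateway is None:
            return "a: direct interface route"
        return "b: route via next hop"


def lookup(table: List[Route], dst: str) -> Optional[Route]:
    """Return the longest-prefix match for dst, or None if nothing matches."""
    addr = IPv4Address(dst)
    matches = [r for r in table if addr in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen, default=None)


# Values taken from this thread; the table itself is illustrative only.
table = [
    Route(IPv4Network("172.22.4.253/32"), "wg0"),
    Route(IPv4Network("172.22.0.0/16"), "em0", IPv4Address("172.22.2.1")),
]

best = lookup(table, "172.22.4.253")
print(best.interface, "-", best.kind)
```

In this model the peer address resolves via wg0 without any gateway, which is the behaviour the report expects next-hop tracking to accept.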

vgrebenschikov avatar Jun 27 '24 14:06 vgrebenschikov

can you please provide the configuration topology for recreation of this bug.

topology is trivial, just two hosts connected with wireguard, and expected that BGP session will work over the tunnel.

image

vgrebenschikov avatar Jun 27 '24 14:06 vgrebenschikov

can you please provide the configuration topology for recreation of this bug.

topology is trivial, just two hosts connected with wireguard, and expected that BGP session will work over the tunnel.

image

can you please provide configuration for better understanding.

tera2603 avatar Jun 27 '24 15:06 tera2603

can you please provide the configuration topology for recreation of this bug.

can you please provide configuration for better understanding.

# ifconfig wg0
wg0: flags=10080c1<UP,RUNNING,NOARP,MULTICAST,LOWER_UP> metric 0 mtu 1420
	options=80000<LINKSTATE>
	inet 172.22.4.192 netmask 0xffffffff

# netstat -rn | fgrep wg0
172.22.4.253       link#6             UHS         wg0

# ping -c1 172.22.4.253
PING 172.22.4.253 (172.22.4.253): 56 data bytes
64 bytes from 172.22.4.253: icmp_seq=0 ttl=64 time=74.567 ms

# vtysh -e 'show run'
...
router bgp 65021
 no bgp ebgp-requires-policy
 no bgp network import-check
 neighbor 172.22.4.253 remote-as 65022
 neighbor 172.22.4.253 interface wg0
 neighbor 172.22.4.253 update-source wg0

# vtysh 
srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  65022
    172.22.4.253 (inaccessible) from 172.22.4.253 (172.22.9.1)
      Origin IGP, invalid, external
      Last update: Thu Jun 27 19:17:33 2024

Notice:

  1. the wg interface has a /32 prefix assigned
  2. FRR does not even notice the valid and working connected route 172.22.4.253/32 and falls back to a bigger network (see below)
  3. the wg0 interface has no BROADCAST in its interface flags and has NOARP, so FRR's assumption that there should be a next hop for wg0 routes is invalid
  4. the BGP session is OK, but the routes are inaccessible

# route -n get 172.22.4.253
   route to: 172.22.4.253
destination: 172.22.4.253
        fib: 0
  interface: wg0
      flags: <UP,HOST,DONE,STATIC>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1420         1         0

# vtysh -e 'show ip route 172.22.4.253'
Routing entry for 172.22.0.0/16
  Known via "ospf", distance 110, metric 120, best
  Last update 00:06:56 ago
  * 172.22.2.1, via em0, weight 1

What fixes the situation is assigning wg0 a subnet that includes the other end of the tunnel. Now FRR thinks the other end is reachable, but in fact it was reachable before:

# ifconfig wg0 172.22.4.192/26

# vtysh -e 'show ip route 172.22.4.253'
Routing entry for 172.22.4.192/26
  Known via "connected", distance 0, metric 1, best
  Last update 00:01:01 ago
  * directly connected, wg0

# vtysh -e 'show ip bgp 172.22.9.0/24'
BGP routing table entry for 172.22.9.0/24, version 6
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  172.22.4.253
  65022
    172.22.4.253 (metric 1) from 172.22.4.253 (172.22.9.1)
      Origin IGP, valid, external, best (First path received)
      Last update: Thu Jun 27 19:17:33 2024
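
The difference between the two address assignments comes down to whether the peer falls inside the prefix configured on wg0. A minimal check with Python's standard `ipaddress` module, using the addresses from this thread (the helper name `peer_on_link` is made up for illustration and says nothing about how FRR implements its check):

```python
from ipaddress import ip_address, ip_interface

PEER = ip_address("172.22.4.253")


def peer_on_link(local: str, peer=PEER) -> bool:
    """True if peer lies inside the network configured on the local interface."""
    return peer in ip_interface(local).network


for local in ("172.22.4.192/32", "172.22.4.192/26"):
    print(local, peer_on_link(local))
# 172.22.4.192/32 False
# 172.22.4.192/26 True
```

With the /32 the peer is outside the interface prefix and is reachable only through the separate host route to wg0; with the /26 the peer is covered by the connected prefix itself.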

Probably the problem is connected with issue #9185.

vgrebenschikov avatar Jun 27 '24 16:06 vgrebenschikov

Is it possible to test this with the latest releases?

ton31337 avatar Jul 04 '24 08:07 ton31337

Is it possible to test this with the latest releases?

Tested on 10.0, the situation is the same:

srv# show ip bgp 172.22.9.0/24
BGP routing table entry for 172.22.9.0/24, version 5
Paths: (1 available, no best path)
  Advertised to non peer-group peers:
  172.22.4.251 172.22.4.253
  65022
    172.22.4.253 (inaccessible) from 172.22.4.253 (172.22.9.1)
      Origin IGP, invalid, external
      Last update: Tue Jul  9 21:03:33 2024
srv#

vgrebenschikov avatar Jul 09 '24 18:07 vgrebenschikov

Could you show the output of show ip route 172.22.9.0/24 json?

ton31337 avatar Sep 02 '24 13:09 ton31337