
Weird BGP IPv6 ll nh behavior

Open qeleq opened this issue 1 year ago • 27 comments

Hi all! FRR version 10.0.

I have two interfaces with IPv6 link-local addresses and eBGP IPv6 sessions:

7: ens13f0np0.80@ens13f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 18:9b:a5:82:25:e2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80:14:fc01:1::2/64 scope link 
       valid_lft forever preferred_lft forever
    inet6 fe80::1a9b:a5ff:fe82:25e2/64 scope link 
       valid_lft forever preferred_lft forever
10: ens28f0np0.80@ens28f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e8:eb:d3:b3:54:b6 brd ff:ff:ff:ff:ff:ff
    inet6 fe80:14:fc01:2::2/64 scope link 
       valid_lft forever preferred_lft forever
    inet6 fe80::eaeb:d3ff:feb3:54b6/64 scope link 
       valid_lft forever preferred_lft forever

FRR settings

frr version 10.0
frr defaults traditional
hostname el-fw1.cdnwb.ru
log syslog informational
service integrated-vtysh-config

router bgp 65323
 neighbor SW-LAN peer-group
 neighbor fe80:14:fc01:1::1 peer-group SW-LAN
 neighbor fe80:14:fc01:1::1 interface ens13f0np0.80
 no neighbor fe80:14:fc01:1::1 enforce-first-as
 neighbor fe80:14:fc01:2::1 peer-group SW-LAN
 neighbor fe80:14:fc01:2::1 interface ens28f0np0.80
 no neighbor fe80:14:fc01:2::1 enforce-first-as
 address-family ipv6 unicast
  neighbor SW-LAN activate
  neighbor SW-LAN soft-reconfiguration inbound
  neighbor SW-LAN route-map FROM_LAN_V6 in
  neighbor SW-LAN route-map TO_LAN_V6 out
 exit-address-family

All sessions are up and stable:

Neighbor          V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
fe80:14:fc01:1::1 4      65322      2400      2170       11    0    0 16:27:31            1        0 N/A
fe80:14:fc01:2::1 4      65322      2362      2142       11    0    0 16:27:31            1        0 N/A

Both BGP peers announce one IPv6 prefix to me, 2a03:720:1000::/36.

el-fw1.cdnwb.ru# sh bgp neighbors fe80:14:fc01:1::1 received-routes

BGP table version is 11, local router ID is 192.168.0.1, vrf id 0
Default local pref 100, local AS 65323
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

  Network          Next Hop            Metric LocPrf Weight Path
 *> 2a03:720:1000::/36
                    fe80:14:fc01:1::1
                                                           0 65322 4206000170 57073 i

Total number of prefixes 1

el-fw1.cdnwb.ru# sh bgp neighbors fe80:14:fc01:2::1 received-routes

BGP table version is 11, local router ID is 192.168.0.1, vrf id 0
Default local pref 100, local AS 65323
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

  Network          Next Hop            Metric LocPrf Weight Path
 *> 2a03:720:1000::/36
                    fe80:14:fc01:2::1
                                                           0 65322 4206000170 57073 i

Total number of prefixes 1

So BGP signaling is OK, but I have a very weird situation when the routes are added to the RIB:

el-fw1.cdnwb.ru# sh bgp neighbors fe80:14:fc01:1::1 received-routes detail

BGP table version is 11, local router ID is 192.168.0.1, vrf id 0
Default local pref 100, local AS 65323
BGP routing table entry for 2a03:720:1000::/36, version 11
Paths: (2 available, best #1, table default)
  Not advertised to any peer
  65322 4206000170 57073
    fe80:14:fc01:2::1 from fe80:14:fc01:2::1 (10.255.193.111)
    (fe80:14:fc01:2::1) (used)
      Origin IGP, valid, external, best (First path received)
      Last update: Mon May 27 17:45:41 2024
  65322 4206000170 57073
    fe80:14:fc01:1::1 (inaccessible, import-check enabled) from fe80:14:fc01:1::1 (10.255.193.110)
    (fe80:14:fc01:1::1) (used)
      Origin IGP, invalid, external
      Last update: Mon May 27 17:45:41 2024

Total number of prefixes 1

Question 1: why is the route from peer fe80:14:fc01:2::1 shown when I ask for the routes received from peer fe80:14:fc01:1::1? The second question is probably related to the first: I have a big problem with installing the route into the RIB. Sometimes I have both routes:

B>* 2a03:720:1000::/36 [20/0] via fe80:14:fc01:1::1, ens13f0np0.80, weight 1, 00:11:59
  *                           via fe80:14:fc01:2::1, ens28f0np0.80, weight 1, 00:11:59

Sometimes one:

B>* 2a03:720:1000::/36 [20/0] via fe80:14:fc01:2::1, ens28f0np0.80, weight 1, 16:38:59

Sometimes none :-(

Help me please.

qeleq avatar May 28 '24 08:05 qeleq

Can you enable debug bgp updates, debug bgp neighbor, debug bgp nht and then send us the logs?

Also, just in case the following commands outputs would be handy too:

show ipv6 nht
show bgp nexthop
show bgp import-check-table
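
As a rough sketch, the debug and show commands requested above could be issued non-interactively through vtysh, which accepts one `-c` option per command. The helper name `vtysh_argv` and the grouping into two lists are illustrative assumptions, and `debug bgp neighbor-events` is assumed to be the full form of the abbreviated `debug bgp neighbor`. The script only builds and prints the command lines; it does not run them.

```python
import shlex

# Commands taken from the request above. "neighbor-events" is assumed to be
# the full keyword behind the abbreviated "debug bgp neighbor".
DEBUG_COMMANDS = [
    "debug bgp updates",
    "debug bgp neighbor-events",
    "debug bgp nht",
]
SHOW_COMMANDS = [
    "show ipv6 nht",
    "show bgp nexthop",
    "show bgp import-check-table",
]


def vtysh_argv(commands):
    """Build an argv list that passes each command to vtysh via its -c option."""
    argv = ["vtysh"]
    for command in commands:
        argv += ["-c", command]
    return argv


if __name__ == "__main__":
    # Print shell-quoted command lines so they can be reviewed before use.
    print(shlex.join(vtysh_argv(DEBUG_COMMANDS)))
    print(shlex.join(vtysh_argv(SHOW_COMMANDS)))
```

The printed lines can be pasted into a root shell on the router; the debug output itself goes to whatever log target is configured (syslog in the configuration posted above).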

ton31337 avatar May 28 '24 15:05 ton31337

Done

VRF default:
 Resolve via default: on
fe80:14:fc01:1::1(Connected)
 resolved via connected
 is directly connected, ens13f0np0.80 (vrf default)
 Client list: bgp(fd 18)
fe80:14:fc01:2::1(Connected)
 resolved via connected
 is directly connected, ens28f0np0.80 (vrf default)
 Client list: bgp(fd 18)
el-fw1.cdnwb.ru# show bgp nexthop
Current BGP nexthop cache:
 fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
  if ens13f0np0.80
  Last update: Mon May 27 16:26:58 2024
 fe80:14:fc01:2::1 valid [IGP metric 0], #paths 1, peer fe80:14:fc01:2::1
  if ens28f0np0.80
  Last update: Mon May 27 16:35:25 2024
 fe80:14:fc01:1::1 invalid, #paths 1
  Must be Connected
  Last update: Wed May 22 17:20:29 2024
el-fw1.cdnwb.ru# show bgp import-check-table
Current BGP import check cache:
el-fw1.cdnwb.ru#

debug.txt

qeleq avatar May 29 '24 07:05 qeleq

You have something strange in next-hop cache:

 fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
  if ens13f0np0.80
  Last update: Mon May 27 16:26:58 2024

 fe80:14:fc01:1::1 invalid, #paths 1
  Must be Connected
  Last update: Wed May 22 17:20:29 2024

Two entries for the same next-hop, but one is invalid, and its last update is much older. Does this bad behavior happen even right after the router is restarted, or does it start to happen after some time?
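
To make the duplicate easier to spot, here is a minimal Python sketch that scans `show bgp nexthop` output for next hops listed both as valid and as invalid. The sample text is the output pasted earlier in this thread; the function names and the regular expression are assumptions based on that output's layout, not on any format guarantee from FRR.

```python
import re
from collections import defaultdict

# Output copied from the thread above.
SAMPLE = """\
Current BGP nexthop cache:
 fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
  if ens13f0np0.80
  Last update: Mon May 27 16:26:58 2024
 fe80:14:fc01:2::1 valid [IGP metric 0], #paths 1, peer fe80:14:fc01:2::1
  if ens28f0np0.80
  Last update: Mon May 27 16:35:25 2024
 fe80:14:fc01:1::1 invalid, #paths 1
  Must be Connected
  Last update: Wed May 22 17:20:29 2024
"""

# An entry line is "<address> valid|invalid ... #paths <n>" (assumed layout).
ENTRY = re.compile(r"^\s*(\S+) (valid|invalid)\b.*#paths (\d+)")


def parse_nexthop_cache(text):
    """Map each next-hop address to a list of (state, path_count) tuples."""
    entries = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            address, state, paths = match.groups()
            entries[address].append((state, int(paths)))
    return dict(entries)


def conflicting_nexthops(entries):
    """Return addresses that appear with both a valid and an invalid entry."""
    return sorted(
        address
        for address, states in entries.items()
        if {state for state, _ in states} == {"valid", "invalid"}
    )


if __name__ == "__main__":
    print(conflicting_nexthops(parse_nexthop_cache(SAMPLE)))
```

In the sample, the path count sits on the invalid entry (`#paths 1`) while the valid entry for the same address has `#paths 0`, which matches the path being reported as inaccessible.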

ton31337 avatar May 29 '24 20:05 ton31337

I don't know whether it's related, but I had a similar issue after restoring a config from 9.1 on 10.0 (where enforce-first-as is the default). After issuing no neighbor XXX enforce-first-as, the peer still showed a weirdly low number of received routes. clear ip bgp did not help either; it was only solved by neighbor XXX shutdown followed by no shutdown.

So the no neighbor XXX enforce-first-as command is only applied after the peer is shut and then no shut.

ahmdzaki18 avatar May 29 '24 22:05 ahmdzaki18

You have something strange in next-hop cache:

 fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
  if ens13f0np0.80
  Last update: Mon May 27 16:26:58 2024

 fe80:14:fc01:1::1 invalid, #paths 1
  Must be Connected
  Last update: Wed May 22 17:20:29 2024

Two entries for the same next-hop, but one is invalid, and its last update is much older. Does this bad behavior happen even right after the router is restarted, or does it start to happen after some time?

It's a new router with a new IPv6 design. I got this problem right after the FRR and host configurations were completed. There was one period of about 15 minutes when everything was working. It seems to me that the situation may change after FRR is restarted: both next hops can become invalid, for example, or both can work; anything is possible. By the way, the next-hop table now is:

el-fw1.cdnwb.ru# sh bgp nexthop
Current BGP nexthop cache:
 fe80:14:fc01:1::1 valid [IGP metric 0], #paths 0, peer fe80:14:fc01:1::1
  if ens13f0np0.80
  Last update: Mon May 27 16:26:58 2024
 fe80:14:fc01:2::1 valid [IGP metric 0], #paths 1, peer fe80:14:fc01:2::1
  if ens28f0np0.80
  Last update: Mon May 27 16:35:25 2024
 fe80:14:fc01:1::1 invalid, #paths 1
  Must be Connected
  Last update: Wed May 22 17:20:29 2024
el-fw1.cdnwb.ru#

qeleq avatar May 30 '24 07:05 qeleq

I don't know whether it's related, but I had a similar issue after restoring a config from 9.1 on 10.0 (where enforce-first-as is the default). After issuing no neighbor XXX enforce-first-as, the peer still showed a weirdly low number of received routes. clear ip bgp did not help either; it was only solved by neighbor XXX shutdown followed by no shutdown.

So the no neighbor XXX enforce-first-as command is only applied after the peer is shut and then no shut.

Sorry, it didn't help me

qeleq avatar May 30 '24 07:05 qeleq

Could you also post the output of "show ipv6 route"?

ton31337 avatar Jul 03 '24 10:07 ton31337

I have a similar issue:

version

FRRouting 10.1.1 (frr-10.1.1) on Linux(6.6.52-0-virt).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/usr' '--localstatedir=/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--libdir=/usr/lib/frr' '--with-moduledir=/usr/lib/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'CC=gcc' 'CXX=g++' 'PYTHON=python3'

Router configs

r1:

hostname r1
ip router-id 1.1.1.1
!
interface eth0
 ipv6 address fd00:1111::1/48
 ipv6 address fe80::1111/64
exit
!
router bgp 1
 neighbor fe80::2222 remote-as 2
 neighbor fe80::2222 interface eth0
 !
 address-family ipv6 unicast
  network fd00:1111::/48
  neighbor fe80::2222 activate
  neighbor fe80::2222 route-map map in
  neighbor fe80::2222 route-map map out
 exit-address-family
exit
!
route-map map permit 1
exit

r2:

hostname r2
ip router-id 2.2.2.2
!
interface eth0
 ipv6 address fd00:2222::1/48
 ipv6 address fe80::2222/64
exit
!
router bgp 2
 neighbor fe80::1111 remote-as 1
 neighbor fe80::1111 interface eth0
 !
 address-family ipv6 unicast
  network fd00:2222::/48
  neighbor fe80::1111 activate
  neighbor fe80::1111 route-map map in
  neighbor fe80::1111 route-map map out
 exit-address-family
exit
!
route-map map permit 1
exit

More information

r1:

r1# show bgp 
BGP table version is 1, local router ID is 1.1.1.1, vrf id 0
Default local pref 100, local AS 1
Status codes:  s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  fd00:1111::/48   ::                       0         32768 i
     fd00:2222::/48   fe80::2222               0             0 2 i

Displayed 2 routes and 2 total paths
r1# show bgp fd00:2222::/48
BGP routing table entry for fd00:2222::/48, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  2
    fd00:2222::1 (inaccessible, import-check enabled) from fe80::2222 (2.2.2.2)
    (fe80::2222) (used)
      Origin IGP, metric 0, invalid, external
      Last update: Fri Sep 27 19:49:23 2024
r1# show bgp nexthop 
Current BGP nexthop cache:
 fe80::2222 valid [IGP metric 0], #paths 0, peer fe80::2222
  Resolved prefix fe80::/64
  if eth0
  Last update: Fri Sep 27 19:46:23 2024
 fe80::2222 invalid, #paths 1
  Must be Connected
  Last update: Fri Sep 27 19:45:41 2024
r1# ping fe80::2222%eth0
PING fe80::2222%eth0 (fe80::2222%2): 56 data bytes
64 bytes from fe80::2222: seq=0 ttl=64 time=0.780 ms
64 bytes from fe80::2222: seq=1 ttl=64 time=1.565 ms
64 bytes from fe80::2222: seq=2 ttl=64 time=0.988 ms
64 bytes from fe80::2222: seq=3 ttl=64 time=1.255 ms
r1# show ipv6 route
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIPng, O - OSPFv3, I - IS-IS, B - BGP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

C>* fd00:1111::/48 is directly connected, eth0, 00:15:08
L>* fd00:1111::1/128 is directly connected, eth0, 00:15:08
C>* fe80::/64 is directly connected, eth0, 00:15:25

r2:

r2# show bgp 
BGP table version is 2, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes:  s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  fd00:1111::/48   fe80::1111               0             0 1 i
 *>  fd00:2222::/48   ::                       0         32768 i

Displayed 2 routes and 2 total paths
r2# show bgp fd00:1111::/48
BGP routing table entry for fd00:1111::/48, version 2
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  fe80::1111
  1
    fd00:1111::1 from fe80::1111 (1.1.1.1)
    (fe80::1111) (used)
      Origin IGP, metric 0, valid, external, best (First path received)
      Last update: Fri Sep 27 19:49:23 2024
r2# show bgp nexthop
Current BGP nexthop cache:
 fe80::1111 valid [IGP metric 0], #paths 1, peer fe80::1111
  Resolved prefix fe80::/64
  if eth0
  Last update: Fri Sep 27 19:47:21 2024
r2# ping fe80::1111%eth0
PING fe80::1111%eth0 (fe80::1111%2): 56 data bytes
64 bytes from fe80::1111: seq=0 ttl=64 time=0.516 ms
64 bytes from fe80::1111: seq=1 ttl=64 time=0.740 ms
64 bytes from fe80::1111: seq=2 ttl=64 time=1.444 ms
64 bytes from fe80::1111: seq=3 ttl=64 time=1.616 ms
r2# show ipv6 route
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIPng, O - OSPFv3, I - IS-IS, B - BGP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

B>* fd00:1111::/48 [20/0] via fe80::1111, eth0, weight 1, 00:12:06
C>* fd00:2222::/48 is directly connected, eth0, 00:14:07
L>* fd00:2222::1/128 is directly connected, eth0, 00:14:07
C>* fe80::/64 is directly connected, eth0, 00:15:22

famfo avatar Sep 27 '24 20:09 famfo

Bumping this as I am also seeing this issue. LL next hops are being shown as inaccessible on one of the peers despite being accessible. FRR version 10.2.

asdfjkluiop avatar Nov 20 '24 02:11 asdfjkluiop

Any updates on this? (ping @ton31337, you asked for additional debug output)

famfo avatar Jan 04 '25 19:01 famfo

Could you try with disable-connected-check for these neighbors? I still need the debug logs, though; could you post them here as well, so we can compare them with the debug output we already have?

Also, to make it clear, could you disable the IPv4 address family for this neighbor (no neighbor xxx activate under address-family ipv4 unicast) and try again? If the issue is gone, then the root cause is clear.

ton31337 avatar Jan 04 '25 19:01 ton31337

Attached the debug information including logging for:

BGP debugging status:
  BGP neighbor-events debugging is on
  BGP next-hop tracking debugging is on
  BGP updates debugging is on (inbound)
  BGP updates debugging is on (outbound)

Note: these are still older FRR versions; I can update them later.

r1.txt r2.txt

famfo avatar Jan 04 '25 20:01 famfo

Ok, but please try with what I wrote above first.

ton31337 avatar Jan 04 '25 21:01 ton31337

I have not collected any debug logs, but I have IPv4 unicast disabled entirely and just tried disabling the connected check on my peer; in my case, at least, it didn't make any difference.

asdfjkluiop avatar Jan 04 '25 21:01 asdfjkluiop

The running configuration is at the end of the log. I have set the configuration options outlined earlier (example from r1):

 neighbor fe80::2222 disable-connected-check
 address-family ipv4 unicast
  no neighbor fe80::2222 activate
 exit-address-family

famfo avatar Jan 04 '25 21:01 famfo

Okay, thanks, I will bootstrap things with your configurations.

ton31337 avatar Jan 05 '25 12:01 ton31337

@famfo was it okay with 9.1, or with any lower version?

ton31337 avatar Jan 06 '25 14:01 ton31337

Yes, but I don't remember the exact version when it stopped working.

famfo avatar Jan 06 '25 15:01 famfo

#14818 looks related

famfo avatar Jan 06 '25 15:01 famfo

I have just encountered the same issue as @famfo, after VyOS bumped their FRR version from 9.1 to 10.2.

So I'm running 10.2 on one side (VyOS), talking to 8.5.6 on the other side (OPNsense), using fe80:: link-local addresses and they are not connecting.

The 8.5.6 side gets a connection reset every 6 minutes:

fe80::101 [Error] bgp_read_packet error: Connection reset by peer

FRR 9.1 works fine, so I downgraded to that for now.

itz-Jana avatar Jan 16 '25 13:01 itz-Jana

I can confirm I am seeing similar behaviour on VyOS 1.5-rolling-202412100007, which is running FRR version 9.1.2:

~$ show version frr
FRRouting 9.1.2 (xxxx) on Linux(6.6.64-vyos).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

I outlined the issue with some debug logs in https://vyos.dev/T7061

For me, the issue occurs over WireGuard tunnel interfaces, although I have not tried with non-WireGuard interfaces. My IPv6 LL next hops are marked as inaccessible even though I can ping the next-hop IP address. The issue is fixed by either a reboot or, oddly enough, running tcpdump on the tunnel interface.

The issue is easily reproducible for me by resetting the BGP peer.

EDIT: I have tried rolling back to a VyOS 1.4 version, which uses FRR version 9.1, and have not been able to reproduce the issue:

~$ show ver frr
FRRouting 9.1 (xxxx) on Linux(6.6.21-amd64-vyos).

sshaun10 avatar Jan 21 '25 23:01 sshaun10

I had also created a basic issue on VyOS https://vyos.dev/T7055

itz-Jana avatar Jan 22 '25 12:01 itz-Jana

We at metal-stack.io face the same issue and have seen some improvement by adding no bgp enforce-first-as, but we are still not sure whether this completely solved it. We are on frr-10.2.1 with a vanilla 6.6.60 kernel on Ubuntu 24.04.

majst01 avatar Jan 23 '25 08:01 majst01

I experience this issue as well.

It appears that when a neighbor is created using neighbor fe80::1 ..., the BGP nexthop table is pre-populated with an invalid entry. After running tcpdump or changing certain configuration, the nexthop table is re-evaluated and an additional entry is added, except this time it is valid.

While debugging, I found the issue goes away if you ensure any link-local with a real ifIndex is considered valid when nht entries are created.
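
A toy model of that workaround, in Python rather than FRR's C: a link-local next hop that is bound to a real interface index is treated as valid regardless of what resolution reported. The function name, its parameters and the "ifindex greater than zero means real" convention are all hypothetical and only illustrate the idea described above; this is not the actual change in the branch.

```python
import ipaddress


def nexthop_considered_valid(address, ifindex, resolved):
    """Hypothetical validity check for a next-hop cache entry.

    address  -- next-hop address as a string
    ifindex  -- interface index the entry is bound to (0 means none, assumed)
    resolved -- whether normal next-hop resolution succeeded
    """
    ip = ipaddress.IPv6Address(address)
    # Workaround idea: trust any link-local next hop tied to a real interface.
    if ip.is_link_local and ifindex > 0:
        return True
    # Everything else keeps relying on normal resolution.
    return resolved


if __name__ == "__main__":
    # fe80::2222 on ifindex 2 mirrors the r1 example earlier in the thread.
    print(nexthop_considered_valid("fe80::2222", 2, False))
```

Under this model the stale "invalid, Must be Connected" entry would only stay invalid if it had no interface attached.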

I do not know what inserts the initial entry. However, it might be better to fix it there, by not inserting one in the first place?

I have a branch that contains the topotest bgp_ipv6_ll_peering2 along with what I mentioned above, if that helps, @ton31337?

jvoss avatar Mar 15 '25 02:03 jvoss

Any updates? Did the fixes get merged into the latest version?

factor2431 avatar May 19 '25 08:05 factor2431

@jvoss, do you want to push a PR and see what our CI thinks about that?

ton31337 avatar May 19 '25 08:05 ton31337

@jvoss, do you want to push a PR and see what our CI thinks about that?

My workaround is potentially more of a hack than a fix. The change will basically assume any link-local address is valid in the BNC.

A better solution might be to not insert an invalid entry when a link-local neighbor is defined in the configuration... however I am not sure where this occurs.

jvoss avatar May 29 '25 15:05 jvoss