RIPng not converging when interface stays up

Open asdfjkluiop opened this issue 3 years ago • 2 comments

Describe the bug

When running ripng with multiple paths to the same address, the failure of one upstream router that does not cause a local interface to go down leaves the route being refreshed, even though the announcements now arrive on a different interface. For example, say the routers behind interface A and interface B both announce routes to 2001:db8::cafe:babe/128. If the router attached to interface A goes down WITHOUT interface A itself entering the down state, the announcements from router B continue to refresh the route indefinitely, so convergence fails. This can happen when running RIP over FOU links or VPNs. It can also happen if ripngd dies on one router while the router itself remains up.

[x] Did you check if this is a duplicate issue?
[ ] Did you test it on the latest FRRouting/frr master branch?

To Reproduce

  1. Set up 3 routers with ripng and have the 2 upstream routers announce routes for the same prefix over a virtual link such as FOU, IPv6 tunnels, or a VPN. An example topology looks like A ----> C <---- B. Routers A and B are NOT connected to each other.
  2. Bring down the upstream router that currently has the route by stopping ripngd, deleting the interface, stopping the VPN, etc. The link on the downstream router should stay up, as it is not a physical link.
  3. Check the RIPng routing table with show ipv6 ripng and watch the route being refreshed by announcements from the other router arriving on an unrelated interface.

Expected behavior

Since the announcements are no longer coming from the next-hop interface for that route, the route should not be refreshed. It should be left to time out, at which point an announcement from another router will replace it.
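The expected behavior can be sketched as follows. This is a minimal illustration of the update rule as I understand it from RFC 2080 (the timeout is only reinitialized when the announcement comes from the router that owns the existing route), not FRR's actual code. The names `Route`, `process_rte`, the interface names, and the interface cost of 1 are assumptions made for the example; deletion and garbage collection are left out.

```python
from dataclasses import dataclass

TIMEOUT = 180   # seconds; default route timeout in RFC 2080
INFINITY = 16   # RIPng metric meaning "unreachable"


@dataclass
class Route:
    nexthop: str       # link-local address of the announcing router
    ifname: str        # interface the announcement arrived on
    metric: int
    expires_at: float  # time at which the timeout fires


def process_rte(table, prefix, metric, sender, ifname, now):
    """Apply one received route entry and return the action taken."""
    # Add the cost of the receiving interface (assumed to be 1 here).
    metric = min(metric + 1, INFINITY)
    route = table.get(prefix)

    if route is None:
        if metric < INFINITY:
            table[prefix] = Route(sender, ifname, metric, now + TIMEOUT)
            return "installed"
        return "ignored"

    if route.nexthop == sender and route.ifname == ifname:
        # Same router that owns the route: accept the metric and,
        # unless it is now unreachable, restart the timeout.
        route.metric = metric
        if metric < INFINITY:
            route.expires_at = now + TIMEOUT
            return "refreshed"
        return "unreachable"

    if metric < route.metric:
        # A different router offers a strictly better path.
        table[prefix] = Route(sender, ifname, metric, now + TIMEOUT)
        return "replaced"

    # A different router with an equal or worse metric must not
    # touch the existing route's timeout.
    return "ignored"
```

In the scenario from this issue, an equal-metric announcement from router B would hit the final `return "ignored"`, so the route learned from router A would time out 180 seconds after A's last announcement instead of being kept alive.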

Versions

OS Version: OpenWRT 21.02.1
Kernel: 5.16.3
FRR: 7.5

asdfjkluiop • Feb 11 '22 06:02

I'm using the Docker Image frrouting/frr:v8.0.1 and running into the same issue.

dereddy93 • Feb 15 '22 15:02

Hello, I have a similar issue, but on a single interface: it receives anycast announcements from several routers, and one of the announcers was lost.

I have a hub-and-spoke VPN with anycast routing. If any spoke fails, its route stays in the table...

FRR: 7.5.1

  HUB

Spoke Spoke


Old Route in table

ip -6 r show | grep 3e91
aa00:aaaa:6162:6370:726f:3e91:0:1 via fe80::1111:c1ff:feaf:93cb dev gpn proto ripng metric 20 pref medium

Packet capture, showing that the announcements include the correct next hop

tcpdump -ni gpn port 521 | grep :3e91
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on gvpn, link-type EN10MB (Ethernet), capture size 262144 bytes
11:27:08.730627 IP6 fe80::a808:abff:fea0:36ba.521 > fe80::4895:1ff:fe24:ff95.521:  ripng-resp 2: aa00:aaaa:6162:6370:726f:3e91:0:1/128 (1) aa00:aaaa:6162:6370:726f:3e92:0:1/128 (1)
11:27:11.603522 IP6 fe80::a808:abff:fea0:36ba.521 > ff02::9.521:  ripng-resp 2: aa00:aaaa:6162:6370:726f:3e91:0:1/128 (1)aa00:aaaa:6162:6370:726f:3e92:0:1/128 (1)
11:27:17.604994 IP6 fe80::a808:abff:fea0:36ba.521 > ff02::9.521:  ripng-resp 2: aa00:aaaa:6162:6370:726f:3e91:0:1/128 (1) aa00:aaaa:6162:6370:726f:3e92:0:1/128 (1)
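To compare the installed route against a capture like the one above, a small script can help. This is a rough sketch that assumes the `tcpdump` and `ip -6 r` output formats shown in this comment; the helper names `announcers` and `installed_nexthop` are made up for the example.

```python
import re

# Matches the sender and the route entries of a ripng-resp line as
# printed by tcpdump in the capture above (format assumed from it).
RESP = re.compile(
    r"IP6 (?P<src>[0-9a-f:]+)\.521 > \S+ +ripng-resp \d+: (?P<rtes>.*)"
)
# One route entry: "prefix/len (metric)".
RTE = re.compile(r"([0-9a-f:]+/\d+) \((\d+)\)")


def announcers(lines, prefix):
    """Return the set of source addresses announcing the given prefix."""
    found = set()
    for line in lines:
        match = RESP.search(line)
        if match is None:
            continue
        for rte_prefix, _metric in RTE.findall(match.group("rtes")):
            if rte_prefix == prefix:
                found.add(match.group("src"))
    return found


def installed_nexthop(route_line):
    """Return the 'via' address from one line of 'ip -6 r show' output."""
    match = re.search(r"via (\S+) dev \S+", route_line)
    return match.group(1) if match else None


capture = [
    "11:27:11.603522 IP6 fe80::a808:abff:fea0:36ba.521 > ff02::9.521:  "
    "ripng-resp 2: aa00:aaaa:6162:6370:726f:3e91:0:1/128 (1)",
]
route = ("aa00:aaaa:6162:6370:726f:3e91:0:1 via fe80::1111:c1ff:feaf:93cb "
         "dev gpn proto ripng metric 20 pref medium")

senders = announcers(capture, "aa00:aaaa:6162:6370:726f:3e91:0:1/128")
stale = installed_nexthop(route) not in senders
```

With the data from this comment, `stale` is true: the kernel still points at fe80::1111:c1ff:feaf:93cb, while the only router announcing the prefix is fe80::a808:abff:fea0:36ba.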

Restart frr

/etc/init.d/frr restart
 * Stopped watchfrr
 * Stopped ripngd
 * Stopped zebra
 * Stopped staticd
 * Started watchfrr

Everything works


ip -6 r show | grep 3e91
aa00:aaaa:6162:6370:726f:3e91:0:1 via fe80::a808:abff:fea0:36ba dev gpn proto ripng metric 20 pref medium


ping aa00:aaaa:6162:6370:726f:3e91:0:1
PING aa00:aaaa:6162:6370:726f:3e91:0:1(aa00:aaaa:6162:6370:726f:3e91:0:1) 56 data bytes
64 bytes from aa00:aaaa:6162:6370:726f:3e91:0:1: icmp_seq=1 ttl=64 time=10.7 ms
64 bytes from aa00:aaaa:6162:6370:726f:3e91:0:1: icmp_seq=2 ttl=64 time=11.4 ms

Event Logs Unexpected

I see these log events, but I don't know whether they are related or point to a fixable cause:
`2022/07/27 10:46:06 RIPNG: ripng join on gpn EADDRINUSE (ignoring)`

victorrodriguez1984 • Jul 27 '22 11:07

I have realized that after an interface change or update, the route through the lost router has to pass through the garbage-collection timer before the new path is learned. As long as the garbage-collection timer has not expired, the route stays in the table. The advertisement has to be lost for that whole timer, which causes an outage during convergence; otherwise it does not work. For active-active interfaces, maybe we should think about the architecture with this RIP behaviour in mind.

"If during the garbage collection period a new RIP Response for the route is received, then as you might expect the deletion process is aborted: the Garbage-Collection timer is cleared, the route is marked as valid again, and a new Timeout timer starts"

The RFC says something different:

"Until the garbage-collection timer expires, the route is included in all updates sent by this router. When the garbage-collection timer expires, the route is deleted from the routing table.

Should a new route to this network be established while the garbage-collection timer is running, the new route will replace the one that is about to be deleted. In this case the garbage-collection timer must be cleared."
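To make the quoted timer behaviour concrete, here is a minimal sketch of the two timers. It assumes the RFC 2080 defaults of 180 seconds for the timeout and 120 seconds for garbage collection; the names `Entry`, `tick`, and `learn` are hypothetical and do not come from FRR.

```python
from dataclasses import dataclass
from typing import Optional

TIMEOUT = 180   # seconds until an unrefreshed route expires (assumed default)
GARBAGE = 120   # seconds an expired route lingers before deletion
INFINITY = 16   # metric meaning "unreachable"


@dataclass
class Entry:
    nexthop: str
    metric: int
    timeout_at: Optional[float]
    garbage_at: Optional[float] = None  # set only while being garbage-collected


def tick(table, now):
    """Advance the timers: expire stale routes, delete collected ones."""
    for prefix in list(table):
        entry = table[prefix]
        if entry.garbage_at is not None:
            if now >= entry.garbage_at:
                del table[prefix]  # garbage-collection timer expired
        elif now >= entry.timeout_at:
            # Timeout fired: keep advertising the route as unreachable
            # and start the garbage-collection timer.
            entry.metric = INFINITY
            entry.timeout_at = None
            entry.garbage_at = now + GARBAGE


def learn(table, prefix, nexthop, metric, now):
    """Install a new route over one that is being garbage-collected."""
    entry = table.get(prefix)
    if entry is not None and entry.garbage_at is not None and metric < INFINITY:
        # The new route replaces the dying one; creating a fresh entry
        # clears the garbage-collection timer.
        table[prefix] = Entry(nexthop, metric, now + TIMEOUT)
        return True
    return False
```

Under this reading, a route from a different router can take over as soon as the old route's timeout has fired, without waiting for the full garbage-collection period.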

victorrodriguez1984 • Sep 22 '22 09:09

This issue is stale because it has been open 180 days with no activity. Comment or remove the autoclose label in order to avoid having this issue closed.

github-actions[bot] • Mar 22 '23 01:03

Guess I need to bump this then

asdfjkluiop • Mar 22 '23 03:03