
pimd: Fix for data packet loss when FHR is LHR and RP

Open routingrocks opened this issue 2 years ago • 7 comments

Topology: A single router is acting as the First Hop Router (FHR), Last Hop Router (LHR), and RP.

RC and Issue: When an upstream (S,G) is in the Join state, the FHR sends a Register message to the RP. If the RP has a receiver, it sends a Register-Stop message and switches to the shortest path. When the FHR processes the Register-Stop message, it removes pimreg from the (S,G) mroute, moves the register state to Prune, and starts the Register-Stop timer.

When the Register-Stop timer expires, PIM changes the (S,G) register state to Join-Pending and sends a Null-Register message to the RP. The RP receives it but does not send a Register-Stop, because SPT is not set at that point.

The problem occurs when the Register-Stop timer expires again while the state is Join-Pending. According to https://www.rfc-editor.org/rfc/rfc4601#section-4.4.1, the pimreg register tunnel must then be put back into the (S,G) mroute. This causes data to be sent to the control plane, which interrupts line-rate forwarding.
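The sequence above can be sketched as a small model of the DR-side Register state machine from RFC 4601 section 4.4.1. This is only an illustration: `struct sg_entry`, its fields, and both function names are hypothetical and are not FRR's actual pimd data structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Register states of an (S,G) entry at the DR (RFC 4601, section 4.4.1). */
enum reg_state { REG_NOINFO, REG_JOIN, REG_PRUNE, REG_JOIN_PENDING };

/* Hypothetical model of one (S,G) entry; not FRR's struct pim_upstream. */
struct sg_entry {
	enum reg_state state;
	bool reg_tunnel_in_oil;  /* true: data is punted to the control plane */
	int null_registers_sent;
};

/* A Register-Stop arrives from the RP. */
static void on_register_stop(struct sg_entry *sg)
{
	assert(sg != NULL);
	if (sg->state == REG_JOIN || sg->state == REG_JOIN_PENDING) {
		sg->reg_tunnel_in_oil = false; /* remove pimreg */
		sg->state = REG_PRUNE;         /* Register-Stop timer restarts here */
	}
}

/* The Register-Stop timer expires. */
static void on_register_stop_timer_expiry(struct sg_entry *sg)
{
	assert(sg != NULL);
	switch (sg->state) {
	case REG_PRUNE:
		/* Probe the RP with a Null-Register and wait for a Register-Stop. */
		sg->state = REG_JOIN_PENDING;
		sg->null_registers_sent++;
		break;
	case REG_JOIN_PENDING:
		/* No Register-Stop arrived: the register tunnel goes back in. */
		sg->state = REG_JOIN;
		sg->reg_tunnel_in_oil = true;
		break;
	default:
		break;
	}
}
```

In this model, two timer expiries without an intervening Register-Stop put the register tunnel back, which corresponds to the point where traffic leaves the fast path.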

Fix: If the router is both the FHR and the RP for the group, ignore the SPT status and send a Register-Stop message back to the DR (in this context, the same router).
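A minimal sketch of the decision the fix describes, under stated assumptions: `struct register_ctx` and `should_send_register_stop` are hypothetical names, not the functions touched by this PR, and the real Register-Stop conditions at the RP (RFC 4601 section 4.4.2) involve more checks than the single SPT flag modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of what the RP knows when a Register arrives. */
struct register_ctx {
	bool i_am_rp;     /* this router is the RP for the group */
	bool i_am_fhr;    /* this router is also the FHR for the source */
	bool spt_bit_set; /* SPT status for the (S,G) entry */
};

/* Decide whether to answer a Register (or Null-Register) with a Register-Stop. */
static bool should_send_register_stop(const struct register_ctx *ctx)
{
	assert(ctx != NULL);
	if (!ctx->i_am_rp)
		return false;
	/* Behaviour described in the fix: FHR == RP, so ignore the SPT status. */
	if (ctx->i_am_fhr)
		return true;
	/* Otherwise keep the simplified pre-existing rule from the description. */
	return ctx->spt_bit_set;
}
```

With this rule, a Null-Register sent by the router to itself is answered even while SPT is not yet set, so the register state returns to Prune instead of falling through to Join.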

Ticket: #
Signed-off-by: Donald Sharp [email protected]
Signed-off-by: Rajesh Varatharaj [email protected]

routingrocks avatar Aug 17 '23 20:08 routingrocks

CI:rerun

Rerunning with the CI webhook after disabling the CI Checks App.

mwinter-osr avatar Aug 17 '23 22:08 mwinter-osr

Continuous Integration Result: FAILED

Test incomplete. See below for issues. CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-13620/

This is a comment from an automated CI system. For questions and feedback in regards to this CI system, please feel free to email Martin Winter - mwinter (at) opensourcerouting.org.

Get source / Pull Request: Successful

Building Stage: Successful

Basic Tests: Incomplete

Addresssanitizer topotests part 4: Incomplete (check logs for details)
Successful on other platforms/tests
  • Addresssanitizer topotests part 3
  • Topotests Ubuntu 18.04 amd64 part 7
  • Topotests debian 10 amd64 part 1
  • Topotests Ubuntu 18.04 i386 part 5
  • Topotests Ubuntu 18.04 i386 part 0
  • Topotests Ubuntu 18.04 amd64 part 4
  • CentOS 7 rpm pkg check
  • Topotests Ubuntu 18.04 i386 part 9
  • Addresssanitizer topotests part 2
  • Topotests Ubuntu 18.04 amd64 part 0
  • Addresssanitizer topotests part 1
  • Topotests Ubuntu 18.04 i386 part 4
  • Topotests debian 10 amd64 part 5
  • Topotests Ubuntu 18.04 amd64 part 1
  • Topotests debian 10 amd64 part 7
  • Debian 9 deb pkg check
  • Addresssanitizer topotests part 8
  • Topotests Ubuntu 18.04 amd64 part 6
  • Topotests Ubuntu 18.04 i386 part 2
  • Topotests Ubuntu 18.04 arm8 part 8
  • Ubuntu 18.04 deb pkg check
  • Topotests debian 10 amd64 part 6
  • Topotests Ubuntu 18.04 arm8 part 6
  • Topotests Ubuntu 18.04 arm8 part 1
  • Topotests Ubuntu 18.04 i386 part 7
  • Addresssanitizer topotests part 6
  • Addresssanitizer topotests part 5
  • Topotests Ubuntu 18.04 arm8 part 3
  • Topotests Ubuntu 18.04 i386 part 3
  • Topotests Ubuntu 18.04 amd64 part 2
  • Topotests Ubuntu 18.04 i386 part 8
  • Topotests debian 10 amd64 part 3
  • Topotests debian 10 amd64 part 8
  • Addresssanitizer topotests part 0
  • Topotests Ubuntu 18.04 arm8 part 4
  • Topotests debian 10 amd64 part 4
  • Topotests debian 10 amd64 part 9
  • Topotests Ubuntu 18.04 arm8 part 9
  • Topotests Ubuntu 18.04 amd64 part 3
  • Topotests Ubuntu 18.04 arm8 part 2
  • Static analyzer (clang)
  • Topotests debian 10 amd64 part 0
  • Addresssanitizer topotests part 9
  • Topotests debian 10 amd64 part 2
  • Topotests Ubuntu 18.04 i386 part 6
  • Topotests Ubuntu 18.04 arm8 part 0
  • Topotests Ubuntu 18.04 arm8 part 7
  • Topotests Ubuntu 18.04 amd64 part 8
  • Topotests Ubuntu 18.04 arm8 part 5
  • Topotests Ubuntu 18.04 i386 part 1
  • Topotests Ubuntu 18.04 amd64 part 9
  • Addresssanitizer topotests part 7
  • Ubuntu 20.04 deb pkg check
  • Debian 10 deb pkg check
  • Topotests Ubuntu 18.04 amd64 part 5

NetDEF-CI avatar Aug 18 '23 01:08 NetDEF-CI

ci:rerun

RodrigoMNardi avatar Oct 03 '23 00:10 RodrigoMNardi

This PR is stale because it has been open 180 days with no activity. Comment or remove the autoclose label in order to avoid having this PR closed.

github-actions[bot] avatar Mar 31 '24 01:03 github-actions[bot]

ci:rerun

routingrocks avatar Aug 23 '24 20:08 routingrocks

Can you rebase instead ?

Jafaral avatar Aug 23 '24 20:08 Jafaral

The failure is not related to the change; re-running CI.

E AssertionError: Testcase test_verify_default_originate_with_2way_ecmp_p2 : After shuting down the interface Convergence is expected to be Failed assert True is not Tru

routingrocks avatar Aug 24 '24 00:08 routingrocks

ci:rerun

routingrocks avatar Dec 17 '24 21:12 routingrocks

ci:rerun

routingrocks avatar Feb 11 '25 02:02 routingrocks

@mergifyio backport dev/10.3 stable/10.2 stable/10.1

Jafaral avatar Feb 20 '25 16:02 Jafaral

backport dev/10.3 stable/10.2 stable/10.1

✅ Backports have been created

mergify[bot] avatar Feb 20 '25 16:02 mergify[bot]