Unstable tcp ldp session when using explicit-null

Open Viktort-rf opened this issue 3 years ago • 4 comments


Describe the bug

When "label local advertise explicit-null" is enabled on both routers, the LDP TCP session cannot be kept up. It drops when the KeepAlive timer expires and is immediately re-established. iptables is disabled. No KeepAlive messages are visible in the packet captures. Note that if "label local advertise explicit-null" is turned off on at least one side, the session is stable.

Put "x" in "[ ]" if you already tried following:

[ ] Did you check if this is a duplicate issue?
[ ] Did you test it on the latest FRRouting/frr master branch?

To Reproduce

  1. Log in to the routers.
  2. Run the following commands on both routers:
       conf t
       mpls ldp
       address-family ipv4
       label local advertise explicit-null
  3. Wait for the LDP session to come up.
  4. Wait the standard 3 minutes (default timers).
  5. The LDP session drops and is renegotiated (its timers are reset to zero).

Expected behavior

The LDP TCP session stays up when the explicit-null label is used.

Versions

  • OS Version: CentOS7
  • Kernel: 5.10.13-1.el7.elrepo.x86_64
  • FRR Version: 7.7-dev_git

Viktort-rf, Mar 23 '21 14:03

The same behaviour with frr 7.5, 7.5.1

anp135, Mar 29 '21 17:03

Hi,

I've noticed the same behavior on the latest 9.2-dev version. At exactly 3 minutes the LDP session goes down and is re-established:

R01:

2024-01-19T17:04:11.094434+02:00 R01 ldpd[2140917]: msg[in]: notification: lsr-id 10.100.2.1, status KeepAlive Timer Expired (fatal error)
2024-01-19T17:04:11.094484+02:00 R01 ldpd[2140917]: nbr_fsm: event SESSION CLOSE resulted in action CLOSE SESSION and changing state for lsr-id 10.100.2.1 from OPERATIONAL to PRESENT
2024-01-19T17:04:11.094536+02:00 R01 ldpd[2140917]: session_close: closing session with lsr-id 10.100.2.1
2024-01-19T17:04:14.409014+02:00 R01 ldpd[2140917]: discovery[recv]: iface lan0.3001 lsr-id 10.100.2.1 transport-address 10.100.2.1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:14.409084+02:00 R01 ldpd[2140917]: discovery[recv]: iface wan0.3000 lsr-id 10.100.2.1 transport-address 10.100.2.1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:14.409173+02:00 R01 ldpd[2140917]: discovery[recv]: iface lan0.3001 lsr-id 10.100.2.1 transport-address fc00:0:0:2::1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:14.409233+02:00 R01 ldpd[2140917]: discovery[recv]: iface wan0.3000 lsr-id 10.100.2.1 transport-address fc00:0:0:2::1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.890825+02:00 R01 ldpd[2140917]: discovery[send]: iface gre1301 (ipv4) holdtime 15
2024-01-19T17:04:15.890974+02:00 R01 ldpd[2140917]: discovery[send]: iface lan0.3001 (ipv4) holdtime 15
2024-01-19T17:04:15.891128+02:00 R01 ldpd[2140917]: discovery[send]: iface wan0.3000 (ipv4) holdtime 15
2024-01-19T17:04:15.891251+02:00 R01 ldpd[2140917]: discovery[send]: iface gre1301 (ipv6) holdtime 15
2024-01-19T17:04:15.891308+02:00 R01 ldpd[2140917]: discovery[send]: iface lan0.3001 (ipv6) holdtime 15
2024-01-19T17:04:15.891383+02:00 R01 ldpd[2140917]: discovery[send]: iface wan0.3000 (ipv6) holdtime 15
2024-01-19T17:04:15.891438+02:00 R01 ldpd[2140917]: discovery[recv]: iface lan0.3001 lsr-id 10.100.2.1 transport-address 10.100.2.1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.891545+02:00 R01 ldpd[2140917]: discovery[recv]: iface lan0.3001 lsr-id 10.100.2.1 transport-address fc00:0:0:2::1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.891649+02:00 R01 ldpd[2140917]: discovery[recv]: iface wan0.3000 lsr-id 10.100.2.1 transport-address 10.100.2.1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.891729+02:00 R01 ldpd[2140917]: discovery[recv]: iface wan0.3000 lsr-id 10.100.2.1 transport-address fc00:0:0:2::1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.891791+02:00 R01 ldpd[2140917]: nbr_fsm: event ADJACENCY MATCHED resulted in action NOTHING and changing state for lsr-id 10.100.2.1 from PRESENT to INITIALIZED
2024-01-19T17:04:16.098875+02:00 R01 ldpd[2140917]: msg[in]: initialization: lsr-id 10.100.2.1
2024-01-19T17:04:16.099001+02:00 R01 ldpd[2140917]: recv_init: lsr-id 10.100.2.1 announced the Dynamic Capability Announcement capability
2024-01-19T17:04:16.099057+02:00 R01 ldpd[2140917]: recv_init: lsr-id 10.100.2.1 announced the Typed Wildcard FEC capability
2024-01-19T17:04:16.099105+02:00 R01 ldpd[2140917]: recv_init: lsr-id 10.100.2.1 announced the Unrecognized Notification capability
2024-01-19T17:04:16.099155+02:00 R01 ldpd[2140917]: nbr_fsm: event INIT RECEIVED resulted in action SEND INIT AND KEEPALIVE and changing state for lsr-id 10.100.2.1 from INITIALIZED to OPENREC
2024-01-19T17:04:16.099203+02:00 R01 ldpd[2140917]: msg[out]: initialization: lsr-id 10.100.2.1
2024-01-19T17:04:16.099978+02:00 R01 ldpd[2140917]: nbr_fsm: event KEEPALIVE RECEIVED resulted in action START NEIGHBOR SESSION and changing state for lsr-id 10.100.2.1 from OPENREC to OPERATIONAL

R02:

2024-01-19T17:04:11.085389+02:00 R02 ldpd[52209]: nbr_ktimeout: lsr-id 10.100.1.1
2024-01-19T17:04:11.085748+02:00 R02 ldpd[52209]: msg[out]: notification: lsr-id 10.100.1.1, status KeepAlive Timer Expired (fatal error)
2024-01-19T17:04:11.085871+02:00 R02 ldpd[52209]: nbr_fsm: event SESSION CLOSE resulted in action CLOSE SESSION and changing state for lsr-id 10.100.1.1 from OPERATIONAL to PRESENT
2024-01-19T17:04:11.086036+02:00 R02 ldpd[52209]: session_close: closing session with lsr-id 10.100.1.1
2024-01-19T17:04:14.410304+02:00 R02 ldpd[52209]: discovery[send]: iface lan0.3001 (ipv4) holdtime 15
2024-01-19T17:04:14.410464+02:00 R02 ldpd[52209]: discovery[send]: iface wan0.3000 (ipv4) holdtime 15
2024-01-19T17:04:14.410573+02:00 R02 ldpd[52209]: discovery[send]: iface lan0.3001 (ipv6) holdtime 15
2024-01-19T17:04:14.410660+02:00 R02 ldpd[52209]: discovery[send]: iface wan0.3000 (ipv6) holdtime 15
2024-01-19T17:04:15.892818+02:00 R02 ldpd[52209]: discovery[recv]: iface wan0.3000 lsr-id 10.100.1.1 transport-address 10.100.1.1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.893060+02:00 R02 ldpd[52209]: discovery[send]: iface lan0.3001 (ipv4) holdtime 15
2024-01-19T17:04:15.893171+02:00 R02 ldpd[52209]: discovery[send]: iface wan0.3000 (ipv4) holdtime 15
2024-01-19T17:04:15.893275+02:00 R02 ldpd[52209]: discovery[send]: iface lan0.3001 (ipv6) holdtime 15
2024-01-19T17:04:15.893378+02:00 R02 ldpd[52209]: discovery[send]: iface wan0.3000 (ipv6) holdtime 15
2024-01-19T17:04:15.893491+02:00 R02 ldpd[52209]: discovery[recv]: iface lan0.3001 lsr-id 10.100.1.1 transport-address 10.100.1.1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.893595+02:00 R02 ldpd[52209]: discovery[recv]: iface lan0.3001 lsr-id 10.100.1.1 transport-address fc00:0:0:1::1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.893686+02:00 R02 ldpd[52209]: discovery[recv]: iface wan0.3000 lsr-id 10.100.1.1 transport-address fc00:0:0:1::1 holdtime 15 (dual stack TLV present)
2024-01-19T17:04:15.893770+02:00 R02 ldpd[52209]: nbr_fsm: event CONNECTION UP resulted in action SETUP NEIGHBOR CONNECTION and changing state for lsr-id 10.100.1.1 from PRESENT to INITIALIZED
2024-01-19T17:04:15.893879+02:00 R02 ldpd[52209]: msg[out]: initialization: lsr-id 10.100.1.1
2024-01-19T17:04:15.893996+02:00 R02 ldpd[52209]: nbr_fsm: event INIT SENT resulted in action NOTHING and changing state for lsr-id 10.100.1.1 from INITIALIZED to OPENSENT
2024-01-19T17:04:16.100525+02:00 R02 ldpd[52209]: msg[in]: initialization: lsr-id 10.100.1.1
2024-01-19T17:04:16.100730+02:00 R02 ldpd[52209]: recv_init: lsr-id 10.100.1.1 announced the Dynamic Capability Announcement capability
2024-01-19T17:04:16.100861+02:00 R02 ldpd[52209]: recv_init: lsr-id 10.100.1.1 announced the Typed Wildcard FEC capability
2024-01-19T17:04:16.100986+02:00 R02 ldpd[52209]: recv_init: lsr-id 10.100.1.1 announced the Unrecognized Notification capability
2024-01-19T17:04:16.101125+02:00 R02 ldpd[52209]: nbr_fsm: event INIT RECEIVED resulted in action SEND KEEPALIVE and changing state for lsr-id 10.100.1.1 from OPENSENT to OPENREC
2024-01-19T17:04:16.101286+02:00 R02 ldpd[52209]: nbr_fsm: event KEEPALIVE RECEIVED resulted in action START NEIGHBOR SESSION and changing state for lsr-id 10.100.1.1 from OPENREC to OPERATIONAL
2024-01-19T17:04:16.101420+02:00 R02 ldpd[52209]: msg[out]: address: lsr-id 10.100.1.1, address 10.100.100.2
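
To read flaps like this out of a longer log, the `nbr_fsm` lines can be parsed into state transitions. The following is a minimal Python sketch, assuming the log line format shown above; the names `FSM_RE`, `parse_transitions` and `outages` are made up for illustration and are not part of FRR. `SAMPLE` holds three of the R02 lines quoted above.

```python
import re
from datetime import datetime

# Matches the nbr_fsm lines of the ldpd debug output quoted in this thread.
FSM_RE = re.compile(
    r"^(?P<ts>\S+) \S+ ldpd\[\d+\]: nbr_fsm: event (?P<event>.+?) "
    r"resulted in action (?P<action>.+?) and changing state for "
    r"lsr-id (?P<lsr_id>\S+) from (?P<old>\S+) to (?P<new>\S+)$"
)

# Three lines copied from the R02 log above.
SAMPLE = [
    "2024-01-19T17:04:11.085871+02:00 R02 ldpd[52209]: nbr_fsm: event SESSION CLOSE resulted in action CLOSE SESSION and changing state for lsr-id 10.100.1.1 from OPERATIONAL to PRESENT",
    "2024-01-19T17:04:15.893770+02:00 R02 ldpd[52209]: nbr_fsm: event CONNECTION UP resulted in action SETUP NEIGHBOR CONNECTION and changing state for lsr-id 10.100.1.1 from PRESENT to INITIALIZED",
    "2024-01-19T17:04:16.101286+02:00 R02 ldpd[52209]: nbr_fsm: event KEEPALIVE RECEIVED resulted in action START NEIGHBOR SESSION and changing state for lsr-id 10.100.1.1 from OPENREC to OPERATIONAL",
]


def parse_transitions(lines):
    """Return (timestamp, lsr_id, event, old_state, new_state) for each nbr_fsm line."""
    transitions = []
    for line in lines:
        m = FSM_RE.match(line.strip())
        if m:
            ts = datetime.fromisoformat(m["ts"])
            transitions.append((ts, m["lsr_id"], m["event"], m["old"], m["new"]))
    return transitions


def outages(transitions):
    """Return (lsr_id, seconds) for each period between leaving and re-entering OPERATIONAL."""
    down_since = {}
    result = []
    for ts, lsr_id, _event, old, new in transitions:
        if old == "OPERATIONAL" and new != "OPERATIONAL":
            down_since[lsr_id] = ts
        elif new == "OPERATIONAL" and lsr_id in down_since:
            gap = ts - down_since.pop(lsr_id)
            result.append((lsr_id, gap.total_seconds()))
    return result


for lsr_id, seconds in outages(parse_transitions(SAMPLE)):
    print(f"{lsr_id}: session down for {seconds:.3f}s")
```

Fed a full log file instead of `SAMPLE`, the same two functions would list every flap per neighbor, which makes it easy to confirm whether the drops really recur at the holdtime interval.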

My configs:

R01:

mpls ldp
 router-id 10.100.1.1
 dual-stack cisco-interop
 neighbor 10.100.2.1 password XXXXXXXX
 neighbor 10.100.13.1 password XXXXXXXX
 !
 address-family ipv4
  discovery transport-address 10.100.1.1
  label local advertise explicit-null
  !
  interface gre1301
  exit
  !
  interface lan0.3001
  exit
  !
  interface wan0.3000
  exit
  !
 exit-address-family
 !
 address-family ipv6
  discovery transport-address fc00:0:0:1::1
  label local advertise explicit-null
  !
  interface gre1301
  exit
  !
  interface lan0.3001
  exit
  !
  interface wan0.3000
  exit
  !
 exit-address-family
 !
exit

R02:

mpls ldp
 router-id 10.100.2.1
 dual-stack cisco-interop
 neighbor 10.100.1.1 password XXXXXXXX
 !
 address-family ipv4
  discovery transport-address 10.100.2.1
  label local advertise explicit-null
  !
  interface lan0.3001
  exit
  !
  interface wan0.3000
  exit
  !
 exit-address-family
 !
 address-family ipv6
  discovery transport-address fc00:0:0:2::1
  label local advertise explicit-null
  !
  interface lan0.3001
  exit
  !
  interface wan0.3000
  exit
  !
 exit-address-family
 !
exit

Are there any workarounds to avoid this issue? Should I disable explicit-null for the moment?

EasyNetDev, Jan 19 '24 15:01

I have hit the same issue. Is this maybe a known limitation? Any help would be appreciated. Thanks! @donaldsharp @qlyoung

anlancs, Jun 12 '24 14:06

@anlancs The problem is likely that MPLS label processing isn't enabled on your interfaces via sysctl.

You can find instructions on how to enable it here: https://docs.frrouting.org/en/stable-5.0/installation.html#linux-sysctl-settings-and-kernel-modules
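
One plausible reading of this suggestion, not confirmed in the thread: with explicit-null the neighbor sends label 0 instead of unlabeled traffic toward the router's own addresses, so LDP's TCP packets can arrive labeled and are discarded if the interface does not accept MPLS input. A quick way to check the current state is to read the kernel's MPLS sysctls from `/proc/sys`. The sketch below assumes the standard Linux locations `net/mpls/platform_labels` and `net/mpls/conf/<iface>/input`; the function name `check_mpls_sysctls` is hypothetical, and the interface names in the usage note are taken from the R01 config above.

```python
from pathlib import Path


def _read_int(path):
    """Return the integer stored in a sysctl file, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def check_mpls_sysctls(interfaces, proc_root="/proc/sys"):
    """Map each problematic sysctl path (relative to proc_root) to a short reason."""
    problems = {}

    # platform_labels defaults to 0, which leaves no room for any label entry.
    labels = _read_int(Path(proc_root) / "net/mpls/platform_labels")
    if labels is None:
        problems["net/mpls/platform_labels"] = "not readable (mpls_router module loaded?)"
    elif labels == 0:
        problems["net/mpls/platform_labels"] = "is 0, so no label can be installed"

    # Per-interface input defaults to 0, which discards labeled packets.
    for ifname in interfaces:
        rel = f"net/mpls/conf/{ifname}/input"
        value = _read_int(Path(proc_root) / rel)
        if value is None:
            problems[rel] = "not readable (interface or MPLS support missing?)"
        elif value == 0:
            problems[rel] = "is 0, labeled packets are discarded on input"
    return problems
```

On R01, `check_mpls_sysctls(["gre1301", "lan0.3001", "wan0.3000"])` should return an empty dict once MPLS input is enabled everywhere. Note that for interface names containing a dot, such as `lan0.3001`, the `sysctl` command expects the dot inside the name to be written as a slash (`net.mpls.conf.lan0/3001.input`), while the `/proc/sys` path used here keeps the name unchanged.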

rwestphal, Jun 15 '24 12:06