Graceful restart is not working if the speaker's IP address changes

Open vernor1 opened this issue 3 years ago • 6 comments

I enabled BGP graceful restart, and it works fine if the restarting speaker uses the same IP address after a restart. However, if the restarting speaker (with the same BGP Identifier) comes back with a different IP address, it is treated as a new neighbor, and the paths received in the initial UPDATE after the restart are simply added alongside the stale routes:

{"192.168.100.100/32":[
    {"nlri":{"prefix":"192.168.100.100/32"},"age":1634333952,"best":true,"attrs":[{"type":1,"value":2},{"type":2,"as_paths":[]},{"type":3,"nexthop":"192.168.100.1"},{"type":5,"value":100}],"stale":true,"source-id":"192.168.200.1","neighbor-ip":"172.20.0.2"},
    {"nlri":{"prefix":"192.168.100.100/32"},"age":1634334403,"best":false,"attrs":[{"type":1,"value":2},{"type":2,"as_paths":[]},{"type":3,"nexthop":"192.168.100.1"},{"type":5,"value":100}],"stale":false,"source-id":"192.168.200.1","neighbor-ip":"172.20.0.3"}
]}
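
The duplication can be spotted mechanically. The following is a minimal sketch that groups the paths of each prefix by `source-id` and reports any BGP Identifier that has both a stale and a fresh path. The JSON is the output above with the `attrs` field trimmed; the helper name `find_shadowed_paths` is made up for this illustration and is not part of GoBGP.

```python
import json
from collections import defaultdict

# Output copied from the report above ("attrs" trimmed for brevity).
rib_json = '''
{"192.168.100.100/32":[
  {"nlri":{"prefix":"192.168.100.100/32"},"age":1634333952,"best":true,
   "stale":true,"source-id":"192.168.200.1","neighbor-ip":"172.20.0.2"},
  {"nlri":{"prefix":"192.168.100.100/32"},"age":1634334403,"best":false,
   "stale":false,"source-id":"192.168.200.1","neighbor-ip":"172.20.0.3"}
]}
'''


def find_shadowed_paths(rib):
    """Return (prefix, source_id, stale_neighbor, fresh_neighbor) tuples for
    every prefix where one BGP Identifier has both a stale and a fresh path."""
    findings = []
    for prefix, paths in rib.items():
        by_source = defaultdict(list)
        for path in paths:
            by_source[path["source-id"]].append(path)
        for source_id, group in by_source.items():
            stale = [p for p in group if p["stale"]]
            fresh = [p for p in group if not p["stale"]]
            for s in stale:
                for f in fresh:
                    findings.append(
                        (prefix, source_id, s["neighbor-ip"], f["neighbor-ip"]))
    return findings


findings = find_shadowed_paths(json.loads(rib_json))
for prefix, source_id, old_ip, new_ip in findings:
    print(f"{prefix}: router-id {source_id} has a stale path via {old_ip} "
          f"and a fresh path via {new_ip}")
```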

Both the helping and the restarting speakers run GoBGP v2.32.0. Why are neighbors identified by their IP address rather than by their BGP Identifier? RFC 4271 suggests that the unique BGP Identifier is used to identify neighbors within an AS, since they can have multiple IP interfaces and their IP addresses are not required to be static. Am I missing a setting, or is this a limitation of the current GoBGP implementation?
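
To make the question concrete, here is a toy model of the two keying strategies. It is not GoBGP code and does not claim to reflect how GoBGP stores paths internally; the function name `simulate` and the table layout are assumptions made only for illustration. It replays the scenario from this report against a path table keyed either by neighbor IP or by BGP Identifier.

```python
def simulate(key_field):
    """Replay the restart scenario against a toy path table.

    key_field selects how the helper identifies a peer:
    "neighbor_ip" or "router_id".
    """
    before = {"neighbor_ip": "172.20.0.2", "router_id": "192.168.200.1"}
    after = {"neighbor_ip": "172.20.0.3", "router_id": "192.168.200.1"}
    prefix = "192.168.100.100/32"

    paths = {}  # (peer_key, prefix) -> {"stale": bool}

    # 1. Initial session: the path is learned.
    paths[(before[key_field], prefix)] = {"stale": False}

    # 2. The session drops: the helper marks that peer's paths as stale.
    for (peer_key, _), path in paths.items():
        if peer_key == before[key_field]:
            path["stale"] = True

    # 3. The speaker reconnects and re-advertises the same prefix.
    paths[(after[key_field], prefix)] = {"stale": False}
    return paths


by_ip = simulate("neighbor_ip")
by_id = simulate("router_id")
print("keyed by neighbor IP:    ", by_ip)
print("keyed by BGP Identifier: ", by_id)
```

In this toy model, keying by IP leaves two entries (one stale, one fresh), which matches the shape of the output above, while keying by BGP Identifier replaces the stale entry in place.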

vernor1 avatar Oct 15 '21 22:10 vernor1

Can you show what the configuration looks like on both sides, and the IP addressing in Linux(?).

ton31337 avatar Oct 25 '21 07:10 ton31337

The IP addressing is simple: both the helping and the restarting speakers are assigned an IP address in 172.0.0.0/8. My helping speaker (reflector) has the following static configuration:

[global.config]
  as = 65000
  router-id = "172.20.0.2"

[[defined-sets.neighbor-sets]]
  neighbor-set-name = "switches"
  neighbor-info-list = ["10.0.0.1", "10.0.0.2"]

[[policy-definitions]]
  name = "drop_to_cds"
  [[policy-definitions.statements]]
    name = "pass_to_switches"
    [policy-definitions.statements.conditions.match-neighbor-set]
      neighbor-set = "switches"
    [policy-definitions.statements.actions]
      route-disposition = "accept-route"
  [[policy-definitions.statements]]
    name = "pass_to_cds"
    [policy-definitions.statements.actions]
      route-disposition = "reject-route"

[[peer-groups]]
  [peer-groups.config]
    peer-group-name = "cdpods"
    peer-as = 65000
  [[peer-groups.afi-safis]]
    [peer-groups.afi-safis.config]
      afi-safi-name = "ipv4-unicast"
  [[peer-groups.afi-safis]]
    [peer-groups.afi-safis.config]
      afi-safi-name = "ipv6-unicast"
  [peer-groups.timers.config]
    hold-time = 300
  [peer-groups.graceful-restart.config]
    enabled = true

[[dynamic-neighbors]]
  [dynamic-neighbors.config]
    prefix = "172.0.0.0/8"
    peer-group = "cdpods"

[[peer-groups]]
  [peer-groups.config]
    peer-group-name = "switches"
    peer-as = 65000
  [[peer-groups.afi-safis]]
    [peer-groups.afi-safis.config]
      afi-safi-name = "ipv4-unicast"
  [[peer-groups.afi-safis]]
    [peer-groups.afi-safis.config]
      afi-safi-name = "ipv6-unicast"
  [peer-groups.route-reflector.config]
    route-reflector-client = true
    route-reflector-cluster-id = "172.20.0.2"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "10.0.0.1"
    peer-group = "switches"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "10.0.0.2"
    peer-group = "switches"
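
As a quick sanity check of the addressing, the sketch below uses only Python's standard `ipaddress` module to test which addresses fall inside the `dynamic-neighbors` prefix from the configuration above. It checks prefix membership only and makes no claim about how GoBGP matches dynamic neighbors internally; the helper name `in_dynamic_prefix` is made up for this example.

```python
import ipaddress

# Prefix taken from the [[dynamic-neighbors]] section above.
dynamic_prefix = ipaddress.ip_network("172.0.0.0/8")


def in_dynamic_prefix(address):
    """Return True if the address lies inside the dynamic-neighbors prefix."""
    return ipaddress.ip_address(address) in dynamic_prefix


# The two neighbor-ip values from the reported output, plus one of the
# statically configured switch neighbors for contrast.
for address in ("172.20.0.2", "172.20.0.3", "10.0.0.1"):
    print(address, in_dynamic_prefix(address))
```

Both 172.20.0.x addresses fall inside the prefix, so a speaker that moves between them can still connect as a dynamic neighbor in the `cdpods` peer group after its address changes.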

The restarting speaker runs without a config file; it is configured at runtime, as described in the next comment.

vernor1 avatar Nov 02 '21 19:11 vernor1

Initial startup of the restarting speaker:

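# Imports (the generated GoBGP Python stubs are assumed to be on the import path)
import grpc
from google.protobuf.any_pb2 import Any
import attribute_pb2
import gobgp_pb2
import gobgp_pb2_grpc

# Start speaker (host_ip and port identify the gobgpd gRPC endpoint)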
api = gobgp_pb2_grpc.GobgpApiStub(grpc.insecure_channel(f'{host_ip}:{port}'))
global_params = {'as': 65000, 'router_id': '192.168.200.1', 'listen_port': -1}
global_config = gobgp_pb2.Global(**global_params)
request_params = {'global': global_config}
api.StartBgp(gobgp_pb2.StartBgpRequest(**request_params))

# Add peer group
peer_group_conf = gobgp_pb2.PeerGroupConf(peer_as = 65000, peer_group_name = 'reflectors')
timers = gobgp_pb2.Timers(config = gobgp_pb2.TimersConfig(hold_time = 300))
graceful_restart = gobgp_pb2.GracefulRestart(enabled = True, restart_time = 300, local_restarting = False)
afi_safis = [
    gobgp_pb2.AfiSafi(
        mp_graceful_restart = gobgp_pb2.MpGracefulRestart(
            config = gobgp_pb2.MpGracefulRestartConfig(enabled = True)),
        config = gobgp_pb2.AfiSafiConfig(
            family = gobgp_pb2.Family(afi = gobgp_pb2.Family.AFI_IP, safi = gobgp_pb2.Family.SAFI_UNICAST),
            enabled = True))
]
api.AddPeerGroup(
    gobgp_pb2.AddPeerGroupRequest(
        peer_group = gobgp_pb2.PeerGroup(
            conf = peer_group_conf,
            timers = timers,
            graceful_restart = graceful_restart,
            afi_safis = afi_safis)))

# Add peer
api.AddPeer(gobgp_pb2.AddPeerRequest(
    peer = gobgp_pb2.Peer(
        conf = gobgp_pb2.PeerConf(
            neighbor_address = '172.20.0.2',
            peer_group = 'reflectors'),
        graceful_restart = graceful_restart)))

# Add path
nlri_any = Any()
nlri_any.Pack(attribute_pb2.IPAddressPrefix(
    prefix_len = 32,
    prefix = '192.168.100.100'))
origin_any = Any()
origin_any.Pack(attribute_pb2.OriginAttribute(origin = 2))
next_hop_any = Any()
next_hop_any.Pack(attribute_pb2.NextHopAttribute(next_hop = '192.168.100.1'))
api.AddPath(
    gobgp_pb2.AddPathRequest(
        table_type = gobgp_pb2.GLOBAL,
        path = gobgp_pb2.Path(
            nlri = nlri_any,
            pattrs = [origin_any, next_hop_any],
            family = gobgp_pb2.Family(afi = gobgp_pb2.Family.AFI_IP, safi = gobgp_pb2.Family.SAFI_UNICAST))),
    2)

vernor1 avatar Nov 02 '21 19:11 vernor1

Quiet termination of the restarting speaker:

kill -2 $(ps | grep gobgpd | awk '{split($1,a," ");print a[1]}')

vernor1 avatar Nov 02 '21 19:11 vernor1

Restarting on a different IP address:

# Start speaker
api = gobgp_pb2_grpc.GobgpApiStub(grpc.insecure_channel(f'{host_ip}:{port}'))
global_params = {'as': 65000, 'router_id': '192.168.200.1', 'listen_port': -1}
global_config = gobgp_pb2.Global(**global_params)
request_params = {'global': global_config}
api.StartBgp(gobgp_pb2.StartBgpRequest(**request_params))

# Re-add the route (same path as in the initial startup)
nlri_any = Any()
nlri_any.Pack(attribute_pb2.IPAddressPrefix(
    prefix_len = 32,
    prefix = '192.168.100.100'))
origin_any = Any()
origin_any.Pack(attribute_pb2.OriginAttribute(origin = 2))
next_hop_any = Any()
next_hop_any.Pack(attribute_pb2.NextHopAttribute(next_hop = '192.168.100.1'))
api.AddPath(
    gobgp_pb2.AddPathRequest(
        table_type = gobgp_pb2.GLOBAL,
        path = gobgp_pb2.Path(
            nlri = nlri_any,
            pattrs = [origin_any, next_hop_any],
            family = gobgp_pb2.Family(afi = gobgp_pb2.Family.AFI_IP, safi = gobgp_pb2.Family.SAFI_UNICAST))),
    2)

# Re-add peer group with restart flag
peer_group_conf = gobgp_pb2.PeerGroupConf(peer_as = 65000, peer_group_name = 'reflectors')
timers = gobgp_pb2.Timers(config = gobgp_pb2.TimersConfig(hold_time = 300))
graceful_restart = gobgp_pb2.GracefulRestart(
    enabled = True,
    restart_time = 300,
    local_restarting = True)
afi_safis = [
    gobgp_pb2.AfiSafi(
        mp_graceful_restart = gobgp_pb2.MpGracefulRestart(
            config = gobgp_pb2.MpGracefulRestartConfig(enabled = True)),
        config = gobgp_pb2.AfiSafiConfig(
            family = gobgp_pb2.Family(afi = gobgp_pb2.Family.AFI_IP, safi = gobgp_pb2.Family.SAFI_UNICAST),
            enabled = True))
]
api.AddPeerGroup(
    gobgp_pb2.AddPeerGroupRequest(
        peer_group = gobgp_pb2.PeerGroup(
            conf = peer_group_conf,
            timers = timers,
            graceful_restart = graceful_restart,
            afi_safis = afi_safis)))

# Re-add peer with restart flag
api.AddPeer(gobgp_pb2.AddPeerRequest(
    peer = gobgp_pb2.Peer(
        conf = gobgp_pb2.PeerConf(
            neighbor_address = '172.20.0.2',
            peer_group = 'reflectors'),
        graceful_restart = graceful_restart)))

vernor1 avatar Nov 02 '21 19:11 vernor1

@ton31337 Here is a minimal test setup that reproduces the issue: https://github.com/vernor1/bgp-graceful-restart All steps are described in the README; the only requirement to run the code is Docker Desktop.

vernor1 avatar Nov 02 '21 20:11 vernor1