
transport_service: Improve connection stability by downgrading connections on substream inactivity

Open lexnv opened this issue 4 months ago • 0 comments

This PR pushes the transport service's keep-alive timeout forward whenever there is substream activity on the connection.

Previously, the keep-alive timeout fired 5 seconds after the connection was reported to the transport service, regardless of substream activity:

  • (0 secs) T0: connection established; keep-alive timeout set 5 seconds in the future
  • (4 secs) T1: substreams A, B, C opened
  • (5 secs) T2: keep-alive timeout triggered and the connection is downgraded. T1 was not taken into account; otherwise the timeout would have triggered at second 9 (T1 at 4 seconds + 5-second keep-alive)
  • (6 secs) T3: substreams A, B, C closed -> connection closes
  • (7 secs) T4: new substreams cannot be opened, even though the connection was expected to stay alive for longer
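
The timeline above comes down to which event the deadline is measured from. A minimal sketch of the two behaviours follows, using hypothetical function names that are not taken from litep2p and whole seconds instead of real timers. It assumes activity times are sorted in ascending order and that activity arriving after the timeout has fired does not revive it; the actual implementation may differ on both points.

```rust
/// Keep-alive window from the description above (5 seconds).
const KEEP_ALIVE_SECS: u64 = 5;

/// Old behaviour: the deadline is fixed when the connection is established
/// and substream activity is ignored.
fn deadline_fixed(established_at: u64, _substream_opened_at: &[u64]) -> u64 {
    established_at + KEEP_ALIVE_SECS
}

/// New behaviour (illustrative): every substream opened before the current
/// deadline pushes the deadline to `activity + KEEP_ALIVE_SECS`.
/// Assumes `substream_opened_at` is sorted in ascending order.
fn deadline_forwarded(established_at: u64, substream_opened_at: &[u64]) -> u64 {
    let mut deadline = established_at + KEEP_ALIVE_SECS;
    for &at in substream_opened_at {
        // Assumption of this sketch: activity after the timeout has already
        // fired cannot extend it.
        if at < deadline {
            deadline = deadline.max(at + KEEP_ALIVE_SECS);
        }
    }
    deadline
}
```

With the connection established at second 0 and substreams opened at second 4, `deadline_fixed` gives second 5 while `deadline_forwarded` gives second 9, matching T2 in the timeline.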

In this PR:

  • A KeepAliveTracker structure that forwards the keep-alive timeout of connections.
  • The connection ID is forwarded to SubstreamOpened events so that substream IDs can be attributed to the right connection. This is needed because the ConnectionContext contains up to two connections (primary and secondary).
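
To illustrate why the connection ID matters, here is a rough sketch of a per-connection deadline tracker. It is not the KeepAliveTracker from this PR: the `KeepAliveSketch` type, its methods, and the `ConnectionId` alias are all hypothetical, and time is passed in explicitly rather than driven by an async timer.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical connection identifier; litep2p defines its own type.
type ConnectionId = usize;

/// Illustrative tracker holding one keep-alive deadline per connection.
struct KeepAliveSketch {
    timeout: Duration,
    deadlines: HashMap<ConnectionId, Instant>,
}

impl KeepAliveSketch {
    fn new(timeout: Duration) -> Self {
        Self { timeout, deadlines: HashMap::new() }
    }

    /// Start tracking a connection, or push its deadline forward when a
    /// substream event arrives for it. Other connections are left untouched.
    fn on_activity(&mut self, id: ConnectionId, now: Instant) {
        self.deadlines.insert(id, now + self.timeout);
    }

    /// Remove and return the connections whose deadline has passed;
    /// a caller would downgrade these.
    fn take_expired(&mut self, now: Instant) -> Vec<ConnectionId> {
        let mut expired: Vec<ConnectionId> = self
            .deadlines
            .iter()
            .filter(|(_, deadline)| **deadline <= now)
            .map(|(id, _)| *id)
            .collect();
        expired.sort_unstable();
        for id in &expired {
            self.deadlines.remove(id);
        }
        expired
    }
}
```

Because deadlines are keyed by connection ID, a substream opened on one connection extends only that connection's deadline. Without the ID on the event, the tracker could not tell which of the two connections in a context the activity belongs to.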

Testing Done

  • test to ensure the keep-alive timeout downgrades the connection after 5 seconds
  • test to ensure the keep-alive timeout is forwarded on substream activity
  • test to ensure a downgraded connection with dropped substreams is closed

Closes https://github.com/paritytech/litep2p/issues/253.

lexnv, Sep 30 '24 14:09