js-libp2p-gossipsub icon indicating copy to clipboard operation
js-libp2p-gossipsub copied to clipboard

Memory leak in streamsOutbound

Open twoeths opened this issue 2 years ago • 2 comments

Description There are items inside streamsOutbound exist with the same size in 2 heap snapshots (one was taken 1 day after another)

Screenshot 2023-12-05 at 13 33 04

there are 55 OutboundStream instances

Screenshot 2023-12-05 at 13 55 24

however there are only 50 items inside Gossipsub peers set

Screenshot 2023-12-05 at 14 00 13

it means that when a peer is disconnected from node, it was removed from Gossipsub peers set but not streamsOutbound map. This is root cause of #6129 and only happen since Lodestar v1.12

twoeths avatar Dec 05 '23 07:12 twoeths

the only way a peer was removed from peers set but not in streamsOutbound is outboundStream?.close() throws error. It was fixed here https://github.com/ChainSafe/js-libp2p-gossipsub/pull/471/files#diff-a2a171449d862fe29692ce031981047d7ab755ae7f84c707aef80701b3ea0c80R855

the fix is included in release 10.1.1 of gossipsub, will try to test it in lodestar

twoeths avatar Dec 05 '23 08:12 twoeths

For reference MAX_OUTBOUND_BUFFER_SIZE = 16MB in Lodestar:

https://github.com/ChainSafe/lodestar/blob/836fabf4ad6e9014484cf988cebea901e3fdddb9/packages/beacon-node/src/network/gossip/gossipsub.ts#L32

Maybe this is a lot, with 50 target peers, so 800 MB of worst case. Outbound streams are only concerned with:

  1. forwarding
  2. publishing

Let's assume we have a full stream that can't take more data. For (1) it's not critical unless it happens extensively. A specific peer not getting a forwarded message sometimes is okay since there should be 7 other peers on average forwarding the same message. For (2) it's more critical since we want to be sure that our message is received timely. However as long we can publish the message to a sufficient number of peers, we are good. Say if 1 out of 50 peers has a full outbound stream it does not matter at all.

dapplion avatar Dec 06 '23 07:12 dapplion

closed as this seems to be fixed

wemeetagain avatar Oct 09 '24 18:10 wemeetagain