Backpressure between components

Open mxinden opened this issue 3 years ago • 7 comments

What is backpressure

A slow consumer should slow down (i.e. backpressure) a fast producer.
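
As a minimal illustration of the idea, using only the Rust standard library and nothing from rust-libp2p: a bounded channel makes the producer wait (or get an error) once the consumer falls behind. The function name `run_pipeline` and the chosen capacities are made up for this sketch.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;
use std::time::Duration;

/// Sends `count` items through a buffer of `capacity` slots to a
/// deliberately slow consumer and returns what the consumer received.
/// (Hypothetical helper, only for illustration.)
fn run_pipeline(capacity: usize, count: u32) -> Vec<u32> {
    let (tx, rx) = sync_channel::<u32>(capacity);

    let producer = thread::spawn(move || {
        for i in 0..count {
            // Blocks while the buffer is full: this is the backpressure.
            tx.send(i).expect("consumer hung up");
        }
        // `tx` is dropped here, which ends the consumer loop below.
    });

    let mut received = Vec::new();
    for item in rx {
        thread::sleep(Duration::from_millis(5)); // slow consumer
        received.push(item);
    }

    producer.join().expect("producer panicked");
    received
}

fn main() {
    // Non-blocking variant: a full buffer is reported to the producer,
    // which can then decide to wait, retry later, or drop the item.
    let (tx, _rx) = sync_channel::<u32>(1);
    assert!(tx.try_send(1).is_ok());
    assert!(matches!(tx.try_send(2), Err(TrySendError::Full(2))));

    // Blocking variant: nothing is lost, the producer is simply slowed down.
    assert_eq!(run_pipeline(2, 5), vec![0, 1, 2, 3, 4]);
}
```

The buffer never holds more than `capacity` items, no matter how fast the producer is.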

Why do we need backpressure

  • Prevent unbounded growth of buffers.
  • Potential DoS defense.
  • Potentially lower latencies.
  • Potentially better resource allocation.

See also coding guidelines - Bound everything.

Where do we enforce backpressure

  • [x] User -> Swarm

    • No backpressure
    • Useful when dialing a lot of peers via Swarm::dial.
    • ConnectionLimit enforces boundedness at least in the Pool.
    • Might not be worth fixing, i.e. bursts might be fine.
  • [ ] User -> NetworkBehaviour

    • No backpressure
    • User can access NetworkBehaviour via Swarm::behaviour_mut
    • Useful e.g. when doing large amounts of DHT lookups via libp2p-kad, many libp2p-request-response requests, ...
    • Potential solution
      • In the case of libp2p-kad have a Kademlia::poll_find_closest_ready that needs to be polled before Kademlia::find_closest similar to Sink::poll_ready.
  • [ ] Swarm -> NetworkBehaviour

    • No backpressure
    • Only system notification events (e.g. connection established), thus not important.
    • Inbound connection
      • Can only be dropped
      • We have static limits via Swarm
      • In the future, with generic connection management, this can be dynamic, e.g. based on your system memory https://github.com/libp2p/rust-libp2p/issues/2824
  • [ ] NetworkBehaviour -> ConnectionHandler

    • No backpressure
    • Swarm does not poll the NetworkBehaviour as long as it could not deliver a previously returned event from the NetworkBehaviour to the destined ConnectionHandler. https://github.com/libp2p/rust-libp2p/blob/f9b4af3d9d5b12b20756e496349b0866baa862da/swarm/src/lib.rs#L1032-L1034
    • NetworkBehaviour is blocked on single slow ConnectionHandler
    • Potential solutions
      • Drop the event
        • NetworkBehaviour already needs to handle the case where the connection closes and thus the event is never delivered to the ConnectionHandler.
        • When sending two events, the first might be dropped (ConnectionHandler busy) while the second might be delivered. Not intuitive.
      • Return event back to the NetworkBehaviour
      • Add NetworkBehaviour::poll_connection_handler_event, providing a list of ConnectionIds with ConnectionHandlers that are ready to receive another event.
  • [ ] NetworkBehaviour -> Swarm

    • NetworkBehaviourAction::Dial
  • [x] ConnectionHandler -> NetworkBehaviour

    • Backpressure
    • ConnectionHandler is blocked on slow NetworkBehaviour
    • Expected behaviour
  • [ ] Connection -> ConnectionHandler

    • [ ] ConnectionHandler::inject_fully_negotiated_inbound
      • Inbound can only drop
      • How about poll_inbound_ready?
      • Note that multistream-select needs to run first, otherwise we can't map to a specific ConnectionHandler
    • [ ] ConnectionHandler::inject_fully_negotiated_outbound
      • How about poll_outbound_substream?
      • Or the ConnectionHandler needs to internally enforce a limit of pending outbound connections. (Lots of duplication given the number of ConnectionHandlers)
    • [ ] ConnectionHandler::inject_event
      • No backpressure
      • Possible solution
        • Add ConnectionHandler::poll_inject_event_ready
  • [ ] ConnectionHandler -> Connection

    • [ ] ConnectionHandlerEvent::OutboundSubstreamRequest
      • Can only be dropped
  • [ ] Stream

    • [ ] Creation of outbound streams
      • Backpressure is only enforced by QUIC
      • Potentially one day by Yamux (https://github.com/libp2p/rust-yamux/issues/150)
      • Potentially one day with https://github.com/libp2p/specs/pull/394
      • For now, we need to drop streams
    • [x] Bytes on stream
      • Fine with Yamux, QUIC and potentially one day qmux
      • No backpressure on streams with mplex https://github.com/libp2p/specs/pull/402
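
Several of the proposals above (Kademlia::poll_find_closest_ready, poll_inbound_ready, ConnectionHandler::poll_inject_event_ready) follow the Sink::poll_ready pattern: the caller has to observe readiness before it may submit work. A rough sketch of that pattern in plain Rust follows. The type `BoundedQueries` and all of its methods are hypothetical stand-ins for a behaviour with a fixed limit of in-flight queries; none of them exist in rust-libp2p.

```rust
use std::collections::VecDeque;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical bounded set of in-flight queries; not part of libp2p-kad.
pub struct BoundedQueries {
    in_flight: VecDeque<String>,
    limit: usize,
    /// Waker of the caller that was last told to wait.
    blocked: Option<Waker>,
}

impl BoundedQueries {
    pub fn new(limit: usize) -> Self {
        Self { in_flight: VecDeque::new(), limit, blocked: None }
    }

    /// Has to return `Poll::Ready` before `start_query` is called,
    /// similar in spirit to `Sink::poll_ready`.
    pub fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        if self.in_flight.len() < self.limit {
            Poll::Ready(())
        } else {
            // Remember the caller so it can be woken once capacity frees up.
            self.blocked = Some(cx.waker().clone());
            Poll::Pending
        }
    }

    /// Starts a query; hands the key back if the limit is reached.
    pub fn start_query(&mut self, key: String) -> Result<(), String> {
        if self.in_flight.len() >= self.limit {
            return Err(key);
        }
        self.in_flight.push_back(key);
        Ok(())
    }

    /// Marks the oldest query as finished and wakes a blocked caller.
    pub fn complete_oldest(&mut self) -> Option<String> {
        let done = self.in_flight.pop_front();
        if done.is_some() {
            if let Some(waker) = self.blocked.take() {
                waker.wake();
            }
        }
        done
    }
}

/// Waker that does nothing; only needed to drive `poll_ready` by hand here.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut queries = BoundedQueries::new(1);

    assert!(queries.poll_ready(&mut cx).is_ready());
    queries.start_query("key-a".to_string()).unwrap();

    // Limit reached: the caller is asked to wait instead of buffering more.
    assert!(queries.poll_ready(&mut cx).is_pending());
    assert!(queries.start_query("key-b".to_string()).is_err());

    // Once a query completes, capacity is available again.
    queries.complete_oldest();
    assert!(queries.poll_ready(&mut cx).is_ready());
}
```

In a real executor the stored waker would reschedule the blocked task; the no-op waker only keeps the sketch self-contained. The same gate could in principle sit in front of inbound substreams or handler events, at the cost of one more poll method per interface.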

Related resources

  • https://github.com/libp2p/rust-libp2p/pull/1602
  • https://github.com/libp2p/rust-libp2p/issues/3041
  • https://github.com/libp2p/rust-libp2p/blob/master/docs/coding-guidelines.md

mxinden, Nov 03 '22 17:11

Relevant discussion happening here: https://github.com/libp2p/rust-libp2p/discussions/3411.

thomaseizinger, Feb 01 '23 00:02

Cross-referencing backpressure tracking issue for Kademlia here https://github.com/libp2p/rust-libp2p/issues/3710.

mxinden, Mar 30 '23 16:03

Cross-referencing a discussion around backpressure: https://github.com/libp2p/rust-libp2p/discussions/4585.

thomaseizinger, Oct 03 '23 03:10

What is the state of this?

sirandreww-starkware, Aug 04 '25 11:08

> What is the state of this?

Hey @sirandreww-starkware. I don't believe anybody is working on this at this time, in case you want to add any suggestions or wish to give it a shot :).

dariusc93, Aug 04 '25 15:08

> What is the state of this?

Hi @sirandreww-starkware, which part are you asking about specifically?

jxs, Aug 09 '25 08:08

@jxs I'm mainly looking for backpressure in gossipsub. I'm stress-testing it to see the maximum throughput I can get when one peer is broadcasting on a topic and the others are listening. It seemed to me that gossipsub does not have backpressure implemented (#6117), although I see this has been answered now and might potentially be an issue on our end.

sirandreww-starkware, Aug 10 '25 06:08