Backpressure between components
What is backpressure
A slow consumer should slow down (i.e. backpressure) a fast producer.
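As a minimal sketch of that definition (not rust-libp2p code), a bounded channel from the Rust standard library shows the effect: once the buffer is full, the producer is told to stop instead of the buffer growing. The helper names `bounded` and `fill_until_full` are hypothetical and exist only for this illustration.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical helper: a channel whose buffer holds at most `capacity` items.
fn bounded(capacity: usize) -> (SyncSender<u32>, Receiver<u32>) {
    sync_channel(capacity)
}

/// Hypothetical helper: pushes items until the channel reports "full" and
/// returns how many items were accepted before the producer was pushed back.
fn fill_until_full(tx: &SyncSender<u32>, items: impl Iterator<Item = u32>) -> usize {
    let mut accepted = 0;
    for item in items {
        match tx.try_send(item) {
            Ok(()) => accepted += 1,
            // Buffer is full: the consumer is too slow, so the producer
            // stops here instead of growing the buffer.
            Err(TrySendError::Full(_)) => break,
            // Consumer is gone: nothing left to send to.
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

With a capacity of 2, a producer offering ten items gets only two accepted; it can make progress again only after the consumer has received something.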
Why do we need backpressure
- Prevent unbounded growth of buffers.
- Potential DoS defense.
- Potentially lower latencies.
- Potentially better resource allocation.
See also coding guidelines - Bound everything.
Where do we enforce backpressure
- [x] User -> Swarm
  - No backpressure
  - Useful when dialing a lot of peers via Swarm::dial.
  - ConnectionLimit enforces boundedness at least in the Pool.
  - Might not be worth fixing, i.e. bursts might be fine.
- [ ] User -> NetworkBehaviour
  - No backpressure
  - User can access NetworkBehaviour via Swarm::behaviour_mut
  - Useful e.g. when doing large amounts of DHT lookups via libp2p-kad, many libp2p-request-response requests, ...
  - Potential solution
    - In the case of libp2p-kad, have a Kademlia::poll_find_closest_ready that needs to be polled before Kademlia::find_closest, similar to Sink::poll_ready.
- [ ] Swarm -> NetworkBehaviour
  - No backpressure
  - Only system notification events (e.g. connection established), thus not important.
  - Inbound connection
    - Can only be dropped
    - We have static limits via Swarm
    - In the future, with generic connection management, this can be dynamic, e.g. based on your system memory https://github.com/libp2p/rust-libp2p/issues/2824
- [ ] NetworkBehaviour -> ConnectionHandler
  - No backpressure
  - Swarm does not poll NetworkBehaviour in case it could not deliver previously returned event from the NetworkBehaviour to the destined ConnectionHandler. https://github.com/libp2p/rust-libp2p/blob/f9b4af3d9d5b12b20756e496349b0866baa862da/swarm/src/lib.rs#L1032-L1034
  - NetworkBehaviour is blocked on single slow ConnectionHandler
  - Potential solutions
    - Drop the event
      - NetworkBehaviour already needs to handle the case where the connection closes and thus the event is never delivered to the ConnectionHandler.
      - When sending two events, first might be dropped (ConnectionHandler busy) while the second might be delivered. Not intuitive.
    - Return event back to the NetworkBehaviour
    - Add NetworkBehaviour::poll_connection_handler_event, providing a list of ConnectionIds with ConnectionHandlers that are ready to receive another event.
- [ ] NetworkBehaviour -> Swarm
  - NetworkBehaviourAction::Dial
- [x] ConnectionHandler -> NetworkBehaviour
  - Backpressure
  - ConnectionHandler is blocked on slow NetworkBehaviour
  - Expected behaviour
- [ ] Connection -> ConnectionHandler
  - [ ] ConnectionHandler::inject_fully_negotiated_inbound
    - Inbound can only drop
    - how about poll_inbound_ready
    - Note that multistream-select needs to run first, otherwise we can't map to a specific ConnectionHandler
  - [ ] ConnectionHandler::inject_fully_negotiated_outbound
    - how about poll_outbound_substream
    - Or the ConnectionHandler needs to internally enforce a limit of pending outbound connections. (Lots of duplication given the amount of ConnectionHandlers)
  - [ ] ConnectionHandler::inject_event
    - No backpressure
    - Possible solution
      - Add ConnectionHandler::poll_inject_event_ready
- [ ] ConnectionHandler -> Connection
  - [ ] ConnectionHandlerEvent::OutboundSubstreamRequest
    - Can only be dropped
- [ ] Stream
  - [ ] Creation of outbound streams
    - Backpressure is only enforced by QUIC
    - Potentially one day by Yamux (https://github.com/libp2p/rust-yamux/issues/150)
    - Potentially one day with https://github.com/libp2p/specs/pull/394
    - For now, we need to drop streams
  - [x] Bytes on stream
    - Fine with Yamux, QUIC and potentially one day qmux
    - No backpressure on streams with mplex https://github.com/libp2p/specs/pull/402
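Several of the proposals in the list above (Kademlia::poll_find_closest_ready, ConnectionHandler::poll_inject_event_ready) follow the Sink::poll_ready shape: the caller must see "ready" before it is allowed to hand over the next item. The following is a rough sketch of that shape using only the standard library. `BoundedInbox`, `send`, `pop` and `CountingWake` are hypothetical names made up for this illustration; none of them is an existing rust-libp2p API.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical bounded inbox standing in for the buffer in front of a slow
/// component (e.g. a ConnectionHandler). Not a rust-libp2p type.
struct BoundedInbox<T> {
    queue: VecDeque<T>,
    capacity: usize,
    /// Waker of a producer that found the inbox full.
    blocked_producer: Option<Waker>,
}

impl<T> BoundedInbox<T> {
    fn new(capacity: usize) -> Self {
        Self { queue: VecDeque::new(), capacity, blocked_producer: None }
    }

    /// Must return `Poll::Ready(())` before `send` is called.
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        if self.queue.len() < self.capacity {
            Poll::Ready(())
        } else {
            // Remember whom to wake once the consumer frees a slot.
            self.blocked_producer = Some(cx.waker().clone());
            Poll::Pending
        }
    }

    /// Enqueues an item. Panics if the caller skipped `poll_ready`.
    fn send(&mut self, item: T) {
        assert!(self.queue.len() < self.capacity, "send called without poll_ready");
        self.queue.push_back(item);
    }

    /// Consumer side: takes one item and wakes a blocked producer, if any.
    fn pop(&mut self) -> Option<T> {
        let item = self.queue.pop_front();
        if item.is_some() {
            if let Some(waker) = self.blocked_producer.take() {
                waker.wake();
            }
        }
        item
    }
}

/// Hypothetical waker that only counts wake-ups, so the sketch can be driven
/// by hand without an async runtime.
struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

Compared to dropping events when the receiver is busy, this keeps ordering intact: the producer holds on to its item until the consumer has room, and it is woken exactly when that happens.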
Related resources
- https://github.com/libp2p/rust-libp2p/pull/1602
- https://github.com/libp2p/rust-libp2p/issues/3041
- https://github.com/libp2p/rust-libp2p/blob/master/docs/coding-guidelines.md
Relevant discussion happening here: https://github.com/libp2p/rust-libp2p/discussions/3411.
Cross-referencing backpressure tracking issue for Kademlia here https://github.com/libp2p/rust-libp2p/issues/3710.
Cross-referencing a discussion around backpressure: https://github.com/libp2p/rust-libp2p/discussions/4585.
What is the state of this?
Hey @sirandreww-starkware. I don't believe anybody is working on this at this time, if you want to add any suggestions or wish to give it a shot :).
What is the state of this?
Hi @sirandreww-starkware, which part are you asking about specifically?
@jxs I'm mainly seeking back-pressure in gossipsub. I'm stress-testing it to see the maximum throughput I can get when one peer is broadcasting on a topic and the others are listening. It seemed to me that gossipsub does not have back-pressure implemented (#6117), although I see this is answered now and might be an issue on our end.