rust-libp2p icon indicating copy to clipboard operation
rust-libp2p copied to clipboard

Simplify `ConnectionHandler` trait by removing as many associated types as possible

Open thomaseizinger opened this issue 3 years ago • 14 comments

Simplify the ConnectionHandler interface. This is one of the first interfaces users get in touch with when the implement their own protocols. Making it as simple as possible without removing functionality should be the goal. All but {To,From}Behaviour should be removed and we should directly hand the negotiated stream to the ConnectionHandler without any form of "upgrade".

### Sub-epics
- [ ] https://github.com/libp2p/rust-libp2p/issues/4306
- [ ] https://github.com/libp2p/rust-libp2p/issues/3268
- [ ] https://github.com/libp2p/rust-libp2p/discussions/4585
- [ ] https://github.com/libp2p/rust-libp2p/issues/3591
### Use `ReadyUpgrade` everywhere
- [ ] https://github.com/libp2p/rust-libp2p/issues/4075
- [ ] https://github.com/libp2p/rust-libp2p/pull/4563
- [ ] https://github.com/libp2p/rust-libp2p/pull/3914
- [ ] https://github.com/libp2p/rust-libp2p/issues/4649
- [ ] refactor(dcutr): use `ReadyUpgrade` for `{In,Out}boundUpgrade`
- [ ] refactor(gossipsub): use `ReadyUpgrade` for `{In,Out}boundUpgrade`
### Migration plan
- [ ] https://github.com/libp2p/rust-libp2p/issues/4307
- [ ] https://github.com/libp2p/rust-libp2p/issues/4521
- [ ] https://github.com/libp2p/rust-libp2p/issues/4697
- [ ] https://github.com/libp2p/rust-libp2p/issues/4790

thomaseizinger avatar Sep 01 '22 13:09 thomaseizinger

I did a bit more exploration and we will need to change a few more things to make this possible.

Concretely, trying to apply ReadyUpgrade (https://github.com/libp2p/rust-libp2p/pull/2855) in as many places as possible will help us with this change. Protocols like identify currently use the upgrade mechanism to directly read a message from the stream and close it afterwards. This is convenient because it means the handler is directly given the event / response without having to manually poll an OptionFuture or FuturesUnordered.

That being said, many protocols do much more complicated things within those upgrade traits (looking at you kademlia and relay) and I think it would overall be more consistent if we just move all of that into the ConnectionHandler.

Potentially, this could be mitigated again by something like https://github.com/libp2p/rust-libp2p/pull/2852 but that abstraction is yet to proof itself.

thomaseizinger avatar Sep 02 '22 09:09 thomaseizinger

That being said, many protocols do much more complicated things within those upgrade traits (looking at you kademlia and relay) and I think it would overall be more consistent if we just move all of that into the ConnectionHandler.

Relay seems to be by far the most complicated one here so any idea that we come up with should be stress-tested against that one.

thomaseizinger avatar Sep 02 '22 09:09 thomaseizinger

Some more thoughts/analysis on this:

Type safety

At the moment, the InboundUpgrade and OutboundUpgrade abstraction provide a form of type-safety. They tie the protocol identifier together with the logic on what should happen with every new inbound and outbound stream. Hardcoding ConnectionHandler to receive "bare" substreams would remove this type-safety.

Realistically, I think type-safety would not suffer much. I'd claim it is equally easy to mix up what to put in InboundUpgrade & OutboundUpgrade vs inject_fully_negotiated_inbound and inject_fully_negotiated_outbound. On the ConnectionHandler level, we again have type-safety because the handler couples the protocol identifier with the logic of how substreams are handled.

Timeouts

rust-libp2p currently offers a timeout for substream upgrades. This timeout would be gone.

Generally, I think every IO operation should have a timeout and I think one design aspect of InboundUpgrade and OutboundUpgrade was to make it easy for users to have a timeout applied to their stream operations. Unfortunately, the abstraction doesn't seem to be quite right here as we can see from the identify example.

Even within our own protocols, we do not make consistent use of this timeout because it is sometimes not possible. For example, the identify protocol needs to observe the remote's address. This information is only available on the behaviour level. For incoming substreams, we thus pass the actual substream all the way to the behaviour and end up polling the future there.

Some ideas on how we could provide this functionality to users are:

  • A wrapper around FuturesUnordered that wraps each item in a timeout: This would be a fairly generic building block that may be applicable in various situations. ConnectionHandlers in particular could use this to enforce a timeout on "protocol futures"[^1].
  • A AsyncRead & AsyncWrite wrapper around NegotiatedSubstream that races each read/write call with a timeout.

Back-pressure on inbound streams

Across rust-libp2p, we take great care (but don't always succeed) in making sure that back-pressure is properly enforced throughout the system to avoid unbounded memory and CPU growth. For inbound substreams, we limit the maximum number of negotiating substreams in HandlerWrapper:

https://github.com/libp2p/rust-libp2p/blob/89f898c69f6e58944e746565c3f8f8a3bbf3324e/swarm/src/connection/handler_wrapper.rs#L258-L267

If we were to remove {In,Out}boundProtocol, we would remove this negotiation step and together with it, the ability to limit the number of concurrently negotiating, inbound substreams.

In reality, this limit is already too easily by-passed without the user noticing by not performing all/any IO in the {In,Out}boundUpgrade traits. Handing out the stream to the ConnectionHandler "leaks" the stream and makes the upgrade process a no-op which immediately creates a slot for a new inbound stream in HandlerWrapper.

[^1]: Protocol futures being futures that perform async reads and writes on a substream and finish with an event / message eventually.

thomaseizinger avatar Sep 02 '22 15:09 thomaseizinger

Type Safety

Realistically, I think type-safety would not suffer much. I'd claim it is equally easy to mix up what to put in InboundUpgrade & OutboundUpgrade vs inject_fully_negotiated_inbound and inject_fully_negotiated_outbound.

Agreed.

  • For example, the identify protocol needs to observe the remote's address. This information is only available on the behaviour level.

If I am not mistaken it is passed to the behaviour as the handler does not know the local information, not the remote.

  • For incoming substreams, we thus pass the actual substream all the way to the behaviour and end up polling the future there.

As an aside, I think we should change that. I don't think we should do I/O on the main event loop.

Timeouts

Some ideas on how we could provide this functionality to users are:

Agreed, that we should provide this timeouts through an alternative mechanism.

In reality, this limit is already too easily by-passed without the user noticing by not performing all/any IO in the {In,Out}boundUpgrade traits. Handing out the stream to the ConnectionHandler "leaks" the stream and makes the upgrade process a no-op which immediately creates a slot for a new inbound stream in HandlerWrapper.

Back-pressure on inbound streams

Agreed. I don't think the current mechanism is solid, and I do agree that it is mostly being bypassed anyways.


[^1]

[^1]: Neat. Didn't know GitHub supports footnotes.

mxinden avatar Sep 06 '22 13:09 mxinden

  • For incoming substreams, we thus pass the actual substream all the way to the behaviour and end up polling the future there.

As an aside, I think we should change that. I don't think we should do I/O on the main event loop.

Tracked here: https://github.com/libp2p/rust-libp2p/issues/2885

mxinden avatar Sep 12 '22 04:09 mxinden

To progress on this front (other than something like #2878), I am thinking of going with a FuturesUnordered wrapper that:

  • Wraps the future in a timeout
  • Never returns None (i.e. hides this in the type interface)

The idea would be that this component is general enough so implementations of ConnectionHandlers can use it for any kind of async processing with substream upgrades only being one of them.

thomaseizinger avatar Oct 19 '22 21:10 thomaseizinger

  • Wraps the future in a timeout

  • Never returns None (i.e. hides this in the type interface)

If I may add a wish, first class bounding of the size of the FuturesUnordered would be appreciated. E.g. for libp2p-kad we need to bound the size to 16.

mxinden avatar Oct 21 '22 14:10 mxinden

So I thought a bit more about this. The current abstraction allows us to directly yield a message from the upgrade. That actually takes some boilerplate off the handler and makes the handler simpler to implement.

Except that this isn't quite true :)

You can't implement a ConnectionHandler without implementing InboundUpgrade (and OutboundUpgrade), meaning the net-gain of a simpler handler is offset by having the user learn another abstraction.

Additionally, not all protocols follow the same pattern of reading only a single message from a stream, meaning you often have to also yield the stream together with the message from the upgrade. At that point, you start mixing paradigms in how messages are sent and received on the streams and I'll claim that it would be simpler to always do it one way.

In the long-run, we could consider removing the UpgradeInfo type and have users directly give us a list of protocols that we should listen on. That would remove another layer although that one arguably does provide some type safety so it is not as clear-cut from my perspective.

thomaseizinger avatar Dec 19 '22 23:12 thomaseizinger

To progress on this front (other than something like #2878), I am thinking of going with a FuturesUnordered wrapper that:

  • Wraps the future in a timeout
  • Never returns None (i.e. hides this in the type interface)

The idea would be that this component is general enough so implementations of ConnectionHandlers can use it for any kind of async processing with substream upgrades only being one of them.

Implementation suggestion:

  • Create a type TimedFutures
  • Only public API is new, add and poll
  • Needs to have an internal waker for when the list is empty
  • Needs to be bounded in size (use const generics?)
  • Can use futures-timer
  • Submitted futures are composed with a module-private future that first checks the timeout, then polls the inner task

Once we have this type, we can:

  • migrate all current uses of {In,Out}boundUpgrade within protocols/ to use ReadyUpgrade
  • push what used to be in the {In,Out}boundUpgrade implementation as a future into TimedFutures
  • poll TimedFutures as part of ConnectionHandler::poll

After that, we can change the definition of ConnectionHandler to no longer require the {In,Out}boundUpgrade traits and always directly pass a NegotiatedSubstream.

thomaseizinger avatar Jan 10 '23 14:01 thomaseizinger

The TimedFutures type will allow us to replace the "timeout" feature currently provided by the upgrade mechanism. @mxinden seems to agree that we won't be losing much type-safety.

Hence, the only feature that is currently provided and will be removed is the "backpressure" on inbound streams. It isn't really backpressure because it only works if the stream is entirely consumed within the {In,Out}boundUpgrade traits. Once we hand it out to the ConnectionHandler, the limit no longer applies.

@libp2p/rust-libp2p-maintainers Do you agree that we can remove this feature for now? It is already tracked in https://github.com/libp2p/rust-libp2p/issues/3078. I am of the opinion that pushing for this issue (no upgrades in connection handlers) gives us a cleaner plate to work on and should make it easier to implement backpressure in a consistent way.

thomaseizinger avatar Jan 11 '23 03:01 thomaseizinger

Hence, the only feature that is currently provided and will be removed is the "backpressure" on inbound streams. It isn't really backpressure because it only works if the stream is entirely consumed within the {In,Out}boundUpgrade traits. Once we hand it out to the ConnectionHandler, the limit no longer applies.

Agreed. I don't think there is much gain from this.

@libp2p/rust-libp2p-maintainers Do you agree that we can remove this feature for now?

Agreed. I am in favor of proceeding here.

The TimedFutures type will allow us to replace the "timeout" feature currently provided by the upgrade mechanism. @mxinden seems to agree that we won't be losing much type-safety.

I don't have an opinion on how to roll this out. The above suggestion (TimedFuture) makes sense to me.

mxinden avatar Jan 18 '23 16:01 mxinden