rust-libp2p transport: remove `address_translation` functionality from `Transport` trait

Description

Currently, the Transport trait exposes an address_translation function that can compute the local listen address from an observed one. This is useful for transports such as TCP where - without port reuse - the port of an outgoing connection is ephemeral. Hence, it would be wrong to hand the observed address to another peer as a possible external address; they will never be able to dial us on this port.

I am suggesting to remove this functionality from the Transport trait and instead put this functionality directly into libp2p::Swarm. At first, this might sound limiting: translation of these addresses is transport specific and users should be able to supply their own transport which might require implementing address translation.

In reality, the set of transports we can perform address translation on is limited by the transports that the multiaddr::Protocol enum supports. Thus, delegating this to the actual Transport implementation creates a false-sense of extensibility.

The mapping algorithm is only specific to the transport technology. In fact, the existing address_translation function only does something meaningful for TCP and QUIC. For TCP, we are conditional on the port_reuse configuration when in fact, also this is redundant. For TCP, we can always replace the observed port with the listen port. If we have port_reuse enabled in the transport, these ports will already be the same and the operation is effectively a no-op. If they are not the same, we will adjust the port to the correct one.

Bear in mind that this does not guarantee that we are actually reachable only this address. This merely generates a candidate (see https://github.com/libp2p/rust-libp2p/pull/3877#issuecomment-1549412143) that we might be reachable on. Actual reachability is subject to be tested with a protocol like AutoNAT.

Motivation

Simplify interfaces and code.

Are you planning to do it yourself in a pull request?

Yes, if we agree that the reasoning is sound.

May 16 '23 11:05 thomaseizinger

@mxinden Do you have any input on this?

May 23 '23 13:05 thomaseizinger

In reality, the set of transports we can perform address translation on is limited by the transports that the multiaddr::Protocol enum supports. Thus, delegating this to the actual Transport implementation creates a false-sense of extensibility.

Apart from removing the "false-sense of extensibility", what problem would it solve to move the address translation to libp2p-swarm?

One could argue that the "false-sense of extensibility" applies to the Transport implementation as a whole. I.e. one can not implement Transport for a new transport protocol for which there is no support in multiaddr::Protocol.

Jun 05 '23 04:06 mxinden

In reality, the set of transports we can perform address translation on is limited by the transports that the multiaddr::Protocol enum supports. Thus, delegating this to the actual Transport implementation creates a false-sense of extensibility.

Apart from removing the "false-sense of extensibility", what problem would it solve to move the address translation to libp2p-swarm?

It reduces the API surface transport implementations need to deal with.
It gives libp2p-swarm more control over the algorithm, which should make it easier to test things as we know what is going on.

One could argue that the "false-sense of extensibility" applies to the Transport implementation as a whole. I.e. one can not implement Transport for a new transport protocol for which there is no support in multiaddr::Protocol.

Yes but it still makes sense to split them into different crates to segregate dependencies from one another. Plus, as we can see in the example of QUIC, one can implement the same transport differently. I'd argue that there is only one correct way to implement address translation and libp2p-swarm should just implement that one as a feature.

Jun 05 '23 06:06 thomaseizinger

If we have port_reuse enabled in the transport, these ports will already be the same

I am not sure this is true. In the case of a NAT in front of the local node the observed port and the listening port will not be the same and thus the former should not be replaced in favor of the latter as is implemented today (see PortReuse::Enabled match arm).

https://github.com/libp2p/rust-libp2p/blob/8b91c89775c1b1134529514c1e6635604ca6d72c/transports/tcp/src/lib.rs#L535-L538

Jun 26 '23 11:06 mxinden

If we have port_reuse enabled in the transport, these ports will already be the same

I am not sure this is true. In the case of a NAT in front of the local node the observed port and the listening port will not be the same and thus the former should not be replaced in favor of the latter as is implemented today (see PortReuse::Enabled match arm).

https://github.com/libp2p/rust-libp2p/blob/8b91c89775c1b1134529514c1e6635604ca6d72c/transports/tcp/src/lib.rs#L535-L538

Hmm. We could detect port-reuse in libp2p-swarm if transports would report a local address for dialed connections, right?

In other words, we could correctly translate observed addresses if outgoing connections had a local address.

Jun 26 '23 16:06 thomaseizinger

So if I understand correctly, the only blocker for this is that Swarm needs to somehow (reliably) learn that libp2p-tcp was configured with port-reuse. I'll shelf this for now but perhaps we can come back to it at some point.

Jun 27 '23 11:06 thomaseizinger

Technically, the setting for port-reuse is per connection right? In theory, a transport could allow for changing this at runtime. Thus, a better model might be if Transport::Dial where to return a type that you can ask whether this connection uses an ephemeral port or not. This is related to https://github.com/libp2p/rust-libp2p/issues/3347.

Additionally, it would also be useful if that returned object would give you access to the local address of the connection. That is a piece of information that we are currently not capturing / exposing.

What do you think of this idea? All connections are either memory, TCP or UDP and the concept of port-reuse applies to all of them. Yes, even to memory because our implementation of MemoryTransport emulates TCP's ephemeral ports for outgoing connections.

We could introduce the following:

pub trait PhysicalConnection {
    fn uses_ephemeral_port(&self) -> bool;
    fn local_addr(&self) -> Multiaddr;
}

And require Transport::Output to implement this for all connections created by Transport::dial. We might have to split Transport::Output into two or also implement the above for all incoming connections. Not sure if that is a good model though because ephemeral ports only make sense for outgoing connections.

Jul 03 '23 13:07 thomaseizinger

@mxinden Currently, we rely on address_translation to give us a correct observed address based on what identify reports.

In the case of port-reuse, this allows for hole-punching to work. Regardless of which port the NAT device assigned to the out-going connection, it will map it back correctly to our local port.

If we don't have port-reuse enabled, is address_translation of any use? Even if the user configured port-forwarding, we have no idea what port they forwarded things to. We might be listening on local port of 9331 and the user opens port 8080 on their router. There is no way we can learn about this port mapping, correct?

Would it be fair to say that doing address_translation on a connection that does not have port-reuse does not make any sense and we should never emit that as a candidate for an external address?

Jul 20 '23 11:07 thomaseizinger

If we don't have port-reuse enabled, is address_translation of any use? Even if the user configured port-forwarding, we have no idea what port they forwarded things to. We might be listening on local port of 9331 and the user opens port 8080 on their router. There is no way we can learn about this port mapping, correct?

In the case of a server with a public interface address_translation is useful. The local node knows on which ports of the public interface it is listening. Though it might not know the IP address of the public interface. It can learn the IP through an outgoing dial to a remote node. It would keep the reported address but address translate the port.

Does that make sense @thomaseizinger?

Jul 31 '23 11:07 mxinden

If we don't have port-reuse enabled, is address_translation of any use? Even if the user configured port-forwarding, we have no idea what port they forwarded things to. We might be listening on local port of 9331 and the user opens port 8080 on their router. There is no way we can learn about this port mapping, correct?

In the case of a server with a public interface address_translation is useful. The local node knows on which ports of the public interface it is listening. Though it might not know the IP address of the public interface. It can learn the IP through an outgoing dial to a remote node. It would keep the reported address but address translate the port.

Does that make sense @thomaseizinger?

It doesn't to me. If the local interface is a public one, we can just query its IP address via the kernel, right? Learning our address by having a remote node tell us what they observed is only useful in the case of a NAT device inbetween us and the remote peer, isn't it?

As soon as we are not in the same subnet as the remote peer, any form of address_translation is at most a good guess because we don't know how the intermediary device allocates ports. I don't know the details about these algorithms but if every outgoing connection uses a different port (i.e. port-reuse is disabled), the NAT device has no way of correlating them and pretty much must also pick a random port.

If on the other hand we have port-reuse enabled and a port forwarding configured in the NAT device, there is at least the possibility that the device will perform a "reverse" port mapping and use the same outgoing port that it would route incoming connections from.

For example, lets say we port-forward 9331 to 8080.

With port-reuse enabled, an outgoing connection will originate from 8080 which can then be mapped to 9331 by the NAT device. In such a scenario, address_translation is useful because it gives us a correct, public address. It is still not guaranteed because the NAT device could just choose any port. Most importantly, for address_translation to work we must not translate the port, only the address.

If port-reuse is disabled, a first outgoing connection might originate from 54321 and a second one from 46342. Translating these to 8080 with the observed address doesn't make any sense because the configured port forwarding is actually 9331.

Address translation in this case only works if the user configured a port-forwarding of 8080 -> 8080.

Jul 31 '23 11:07 thomaseizinger