go-libp2p icon indicating copy to clipboard operation
go-libp2p copied to clipboard

DNS resolution should be moved to the transports

Open marten-seemann opened this issue 3 years ago • 6 comments

We currently resolve multiaddrs in the basichost: https://github.com/libp2p/go-libp2p/blob/d7ba37217c6f1a0d739666f78ac19a87728a7280/p2p/host/basic/basic_host.go#L712

This leads to problems with transports that, for example, validate TLS certificates, as they're kept ignorant about the domain name, and only learn the resolved address. This applies to WebSockets (thanks to @aschmahmann for bringing it up in #1592) and will apply to WebTransport as well (although we will mostly use certificate hashes there).

Architecturally, address resolution should probably live within the transport. This has a few advantages:

  • it gives full control what to do with a multiaddr to that transport
  • it makes address resolution part of the dial timeout
  • it speeds up things, since we don't need to resolve all the peer's addresses before we start dialing
  • it parallelises address resolution (as dials are run concurrently)
  • it moves logic out of the basic host

However, there are caching implications. Currently, we implement some rudimentary caching for a const 2 minutes, not based on the TTL of the DNS record: https://github.com/libp2p/go-libp2p/blob/d7ba37217c6f1a0d739666f78ac19a87728a7280/p2p/host/basic/basic_host.go#L707 Arguably, caching should live inside the DNS resolver, so one could argue that this is safe to remove.

@Stebalien @vyzo @aschmahmann @MarcoPolo Thoughts?

marten-seemann avatar Jun 11 '22 12:06 marten-seemann

I am not convinced its the transport's responsibility to do dns resolution; and we can pass the domain in a contrext option for websockets.

For caching, we should probably rely on the resolver, might make sense to make a caching layer for the vanilla/OS resolver as the DoH resolver already does it.

vyzo avatar Jun 11 '22 16:06 vyzo

This leads to problems with transports that, for example, validate TLS certificates, as they're kept ignorant about the domain name, and only learn the resolved address

IIUC this isn't technically what happens instead we get both /dns4/foo.tld/tcp/443/wss/p2p/QmFoo and /ip4/1.2.3.4/tcp/443/wss/p2p/QmFoo in the peerstore and then are stuck with the decision of dialing both (whether sequentially or in parallel) or choosing one.

At the moment WSS chooses the worst option by only choosing the IP one due to the filtering here https://github.com/libp2p/go-libp2p/blob/d7ba37217c6f1a0d739666f78ac19a87728a7280/p2p/transport/websocket/websocket.go#L26

I am not convinced its the transport's responsibility to do dns resolution; and we can pass the domain in a contrext option for websockets.

IIUC the issue above is not so much that we're missing the DNS ID when dialing websockets its that we've effectively thrown away information by resolving DNS -> IP and keeping them as two equally valid addresses, when in reality one of them derives from the other. This means we're stuck trying to figure out which of /dns4/foo.tld/tcp/443/wss/p2p/QmFoo and /ip4/1.2.3.4/tcp/443/wss/p2p/QmFoo is the "real" address.

A couple ways I can see approaching this:

  1. Only keep around top-level addresses (i.e. only resolve /p2p/QmFoo but no other address pieces) in the set of addresses to dial coming from the peerstore. Then let transports do resolution or ask the host to do resolution if needed.
  2. Keep the top-level addresses in the peerstore as the set of peers to dial, but also pass around a list of "derived addresses" that the transport can also choose to make use of

I suspect 1 is easier to do, but either seem plausible.

aschmahmann avatar Jun 13 '22 14:06 aschmahmann

It seems to me that a transport should encapsulate all the logic around (at least a part of) the multiaddr. It seems weird that the host knows how to resolve a dns name to an ip address, shouldn't this be fully the transport's responsibility?

I think I agree with @marten-seemann here. As for caching, isn't the OS doing this already? Have we seen a need to handle this caching ourselves?

@aschmahmann

Only keep around top-level addresses (i.e. only resolve /p2p/QmFoo but no other address pieces) in the set of addresses to dial coming from the peerstore. Then let transports do resolution or ask the host to do resolution if needed.

I'm not sure I follow, do you mean only keep the peer id in the peer store? Or only keep the unresolved /dns4/... address?

MarcoPolo avatar Jun 14 '22 23:06 MarcoPolo

As for caching, isn't the OS doing this already? Have we seen a need to handle this caching ourselves?

Ah we support custom dns resolvers. Okay I agree the cache should live with these resolvers.

MarcoPolo avatar Jun 15 '22 00:06 MarcoPolo

I'm not sure I follow, do you mean only keep the peer id in the peer store? Or only keep the unresolved /dns4/... address?

I meant keeping the addresses that libp2p learns about either from the caller calling something like host.Connect() or learned via some other protocol like identify. So in this case the /dns4/....

The comment around the peer ID was just to say that if the caller does something like host.Connect on /p2p/PeerID which causes some resolver (e.g. the IPFS Public DHT) to do a lookup and learn about /dns4/... we should count that one too and not just say the /p2p/PeerID was the top level and /dns4/... was a resolution of that address.

aschmahmann avatar Jun 15 '22 00:06 aschmahmann

To clarify, this proposal only applies to /dnsX addresses. The host still needs to resolve /dnsaddrs. Ideally, we find a way to that concurrently with other dials.

marten-seemann avatar Jun 20 '22 19:06 marten-seemann

Just started to play around with this, and ran into the following issue: madns.Resolver.Resolve returns a []ma.Multiaddr. This is correct, a single DNS name can resolve to multiple IP addresses.

This however prevents us from just doing a call to Resolve in transport.Dial. Dial is expected to dial a single connection, and shouldn't have to deal with concurrent dials (that logic belongs in the swarm).

marten-seemann avatar Aug 22 '22 12:08 marten-seemann