rust-libp2p
rust-libp2p copied to clipboard
Add `dns` and `relay::v2::client::transport::ClientTransport` to relay_v2 example
Here I'm creating a reproducible example for #2503
Could you test with the patch below?
diff --git a/protocols/relay/examples/relay_v2.rs b/protocols/relay/examples/relay_v2.rs
index 0e3fde7c..2794405b 100644
--- a/protocols/relay/examples/relay_v2.rs
+++ b/protocols/relay/examples/relay_v2.rs
@@ -57,8 +57,8 @@ fn main() -> Result<(), Box<dyn Error>> {
let tcp = tcp::TcpConfig::new().nodelay(true).port_reuse(true);
let dns_tcp = block_on(dns::DnsConfig::system(tcp))?;
- let transport = dns_tcp
- .or_transport(relay_transport)
+ let transport = relay_transport
+ .or_transport(dns_tcp)
Could you test with the patch below?
diff --git a/protocols/relay/examples/relay_v2.rs b/protocols/relay/examples/relay_v2.rs index 0e3fde7c..2794405b 100644 --- a/protocols/relay/examples/relay_v2.rs +++ b/protocols/relay/examples/relay_v2.rs @@ -57,8 +57,8 @@ fn main() -> Result<(), Box<dyn Error>> { let tcp = tcp::TcpConfig::new().nodelay(true).port_reuse(true); let dns_tcp = block_on(dns::DnsConfig::system(tcp))?; - let transport = dns_tcp - .or_transport(relay_transport) + let transport = relay_transport + .or_transport(dns_tcp)
That works @mxinden! Is this an intended behaviour?
Thanks for debugging this @demfabris!
Just for the sake of completeness, this is what is happening:
- With
OrTransport(relay_transport, dns_tcp_transport)(The error case)OrTransportforwards the dial todns_transportfirstdns_transportresolves the DNS address to an IP address and forwards the dial totcp_transporttcp_transportsees thep2p-circuitand errors out withMultiaddrNotSupported
- With
OrTransport(dns_tcp_transport, relay_transport)(The solution)OrTransportforwards the dial torelay_transportfirstrelay_transportdetects theMultiaddrto contain ap2p-circuit- ... everything goes as expected.
Is this an intended behaviour?
Let's say I would prefer to not have this footgun. A way I see to prevent users from running into this: Take the other Transport implementation (in our case dns_transport) as an argument to Client::new_transport_and_behaviour and combine the relay_transport and the dns_transport via OrTransport in the right order within the new_transport_and_behaviour method, returning the combined `OrTransport.
What do you think @demfabris?
Why can’t the OrTransport try with the second option after having seen the error from the first?
Hmm I see @mxinden. So you're suggesting we feed an optional transport arg to Client::new_transport_and_behaviour.
I'm a bit worried about this function signature becoming confusing. What you have in mind?
Also, since we are proposing a fix specifically to relay module, how confident are you about other possible problematic transport combinations?
Why can’t the OrTransport try with the second option after having seen the error from the first?
Each Transport::dial implementation can be thought of as two phases.
-
Synchronous validation of the provided
Multiaddr. E.g. in the case oflibp2p-tcpit validates theMultiaddrviamultiaddr_to_socketaddr. https://github.com/libp2p/rust-libp2p/blob/861e15dabbba28ce5aff3c6332ab10ce9c4c24b8/transports/tcp/src/lib.rs#L396-L403 Once phase one is successful, eachTransport::dialimplementation returns aFuture, e.g. in the case of TCP it returns thedo_dialfuture: https://github.com/libp2p/rust-libp2p/blob/861e15dabbba28ce5aff3c6332ab10ce9c4c24b8/transports/tcp/src/lib.rs#L350-L369 -
Asynchronous polling of returned
Futureexecutes the actual dialing logic, e.g. in the case oflibp2p-tcpit waits for the socket to become writable: https://github.com/libp2p/rust-libp2p/blob/861e15dabbba28ce5aff3c6332ab10ce9c4c24b8/transports/tcp/src/provider/async_io.rs#L51-L52
The fail-over mechanism of the OrTransport (currently) only happens on failure of phase 1, but not on failure of phase 2.
https://github.com/libp2p/rust-libp2p/blob/861e15dabbba28ce5aff3c6332ab10ce9c4c24b8/core/src/transport/choice.rs#L66-L84
In order to fail over after a failure of the first Transport in its phase 2, the Future returned by OrTransport::dial would need to keep a copy of the second Transport in order to be able to execute a second dial once the first failed. That would require the second Transport to be Clone.
@rkuhn does the above make sense?
I am not trying to say, that this should not be done, I am only laying out the status quo.
Also, since we are proposing a fix specifically to
relaymodule, how confident are you about other possible problematic transport combinations?
@demfabris I am not aware of any other problematic transport combinations. Am I missing one? In case there is none, the above suggestion of extending new_transport_and_behaviour would be a valid fix and not so much a hack.
I'm a bit worried about this function signature becoming confusing. What you have in mind?
What do you think of the diff below?
diff --git a/protocols/relay/src/v2/client.rs b/protocols/relay/src/v2/client.rs
index fa4447df..cd1f3aba 100644
--- a/protocols/relay/src/v2/client.rs
+++ b/protocols/relay/src/v2/client.rs
@@ -101,17 +101,18 @@ pub struct Client {
}
impl Client {
- pub fn new_transport_and_behaviour(
+ pub fn new_transport_and_behaviour<T>(
local_peer_id: PeerId,
+ transport: T,
) -> (transport::ClientTransport, Self) {
- let (transport, from_transport) = transport::ClientTransport::new();
+ let (relay_transport, from_transport) = transport::ClientTransport::new();
let behaviour = Client {
local_peer_id,
from_transport,
directly_connected_peers: Default::default(),
queued_actions: Default::default(),
};
- (transport, behaviour)
+ (OrTransport(relay_transport, transport), behaviour)
}
}
Due to another reason, putting the relay_transport first is documented below. Though I am sure most folks miss this small comment, thus whatever we do, I think we should enforce this at the type system level.
https://github.com/libp2p/rust-libp2p/blob/96dbfcd1ade6de5b4ae623f59fb0d67b2916576a/protocols/relay/src/v2/client/transport.rs#L110-L111
Sounds alright to me @mxinden. Let's move forward with the fix 👍🏻
Hey @mxinden I was about to push this changes but something got me thinking:
Your suggested diff would cause it to be required to have an existing transport to construct a v2::Client, even tho I don't see the use case for a single v2::Client transport to be operating alone, do you think this is something we want?
What you think about we make two construct methods, one called from_transport... and the other remain as you suggested?
Your suggested diff would cause it to be required to have an existing transport to construct a
v2::Client, even tho I don't see the use case for a singlev2::Clienttransport to be operating alone, do you think this is something we want?
I don't think one would ever construct a v2::Client without an additional Transport. After all, the relaying has to happen over some Transport. Let me know in case I am missing something.
I totally agree, I'm asking this because one of the tests is implemented with v2::Client alone and it got me wondering
I totally agree, I'm asking this because one of the tests is implemented with
v2::Clientalone and it got me wondering
Oh, mind pointing me to it @demfabris?
I think this section confused me earlier https://github.com/libp2p/rust-libp2p/blob/master/protocols/relay/tests/v2.rs#L320 but now I see that's not the case 😆. Sorry I must've read this on a hurry
This pull request has merge conflicts. Could you please resolve them @demfabris? 🙏
Closing as stale.