node
node copied to clipboard
net: add autoDetectFamily option
This PR implements the Happy Eyeballs algorithm.
A new option autoDetectFamily
is added to net.connect
.
When enabled and when family is still 0
, the lookup phase will keep all records by setting all=true
.
A connection attempt is tried to all A and AAAA records, keeping only the first successful connection, while others are disconnected.
Errors are raised only if no connection succeeded.
Fixes #41625.
Review requested:
- [ ] @nodejs/net
@nodejs/tsc this is a notable change that we should land before v18 goes LTS. We have seen quite a few reports of bugs due to improper configuration of IPv6, making https://github.com/nodejs/node/issues/41625 a necessity. For example, Docker on Mac only exposes ports on the IPv4 address, making all connections to localhost
fail.
No comment (yet at least) one way or the other on the implementation, but +1000 to implementing happy eyeballs in Node.js.
The two new tests fail for me (Linux x64) and also in GitHub actions.
=== release test-http-happy-eyeballs ===
Path: parallel/test-http-happy-eyeballs
Error: --- stderr ---
node:events:491
throw er; // Unhandled 'error' event
^
Error: listen EADDRINUSE: address already in use :::33167
at Server.setupListenHandle [as _listen2] (node:net:1649:16)
at listenInCluster (node:net:1697:12)
at doListen (node:net:1846:7)
at process.processTicksAndRejections (node:internal/process/task_queues:83:21)
Emitted 'error' event on Server instance at:
at emitErrorNT (node:net:1676:8)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
code: 'EADDRINUSE',
errno: -98,
syscall: 'listen',
address: '::',
port: 33167
}
Node.js v19.0.0-pre
Command: out/Release/node /home/runner/work/node/node/test/parallel/test-http-happy-eyeballs.js
=== release test-net-happy-eyeballs ===
Path: parallel/test-net-happy-eyeballs
Error: --- stderr ---
node:events:491
throw er; // Unhandled 'error' event
^
Error: listen EADDRINUSE: address already in use :::37301
at Server.setupListenHandle [as _listen2] (node:net:1649:16)
at listenInCluster (node:net:1697:12)
at doListen (node:net:1846:7)
at process.processTicksAndRejections (node:internal/process/task_queues:83:21)
Emitted 'error' event on Server instance at:
at emitErrorNT (node:net:1676:8)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
code: 'EADDRINUSE',
errno: -98,
syscall: 'listen',
address: '::',
port: 37301
}
Node.js v19.0.0-pre
Command: out/Release/node /home/runner/work/node/node/test/parallel/test-net-happy-eyeballs.js
===
=== 2 tests failed
===
Acknowledged! I'm taking care of them locally.
No comment (yet at least) one way or the other on the implementation, but +1000 to implementing happy eyeballs in Node.js.
Same goes for me as well.
we should land before v18 goes LTS
@nodejs/tsc ~Two comments.~
- ~If I understand correctly, this is a breaking change. It should be semver major.~ EDIT: Oh, wait, this is an option that is off by default? So people would need to opt in to get the fix? Oh, then I can't imagine this is controversial at all. Yes, let's get this working and landed! Woo hoo! 🎉 🚀
- ~However...I am in favor of landing this in the 18.x line as a special case of an important bug fix. Specifically:~
~Docker on Mac only exposes ports on the IPv4 address, making all connections to
localhost
fail.~
I'm going to rebase and force push to at least get the tarball job working.
Oh, wait, this is an option that is off by default? So people would need to opt in to get the fix?
My case is that we should make it enabled by default. This causes so many headaches to newbie developers that know little of IPv4 vs IPv6 and why localhost is two IPs.
We should make this change before v18 goes LTS.
@mcollina how about we make it the default in a subsequent PR where only that bit can be discussed?
@targos @mcollina
I put the default for each new socket in Socket.autoDetectFamily
(note: a class property that is then reflected to instance options unless the user specifically provides that) and set it to true
.
This was both to make it easy to fix tests and also to make easy for user to make their choice no matter the default we will agree on.
@aduh95 I will add the test you asked for, but I'm not gonna be able to until this Wednesday. So we can choose to land this and I'll send a specific PR later.
@mcollina how about we make it the default in a subsequent PR where only that bit can be discussed?
Agreed!
@mcollina how about we make it the default in a subsequent PR where only that bit can be discussed?
Agreed!
Ok, understood. I'll send an update to this PR shortly.
Can we add test for
dnsPromises.Resolver
as well?
I was willing to, but then I realized I'm not sure why we need a test for this. Could you please clarify?
CI: https://ci.nodejs.org/job/node-test-pull-request/46859/
@nodejs/build I see that the new tests introduce failed due to address already in use under IPv6. I do use {port: 0}
when listening, so I don't see how that it happens.
Am I missing something about the CI environment?
@nodejs/build I see that the new tests introduce failed due to address already in use under IPv6. I do use
{port: 0}
when listening, so I don't see how that it happens.Am I missing something about the CI environment?
The same tests are failing in Actions (https://github.com/nodejs/node/actions/runs/3134711995/jobs/5089627090) so this is not specific to the Jenkins CI environment.
General advice: tests in parallel
will be competing for ports with other tests in parallel. Also maybe you'd need to double check the port is actually 0 (i.e. the code isn't somehow setting it to a non-zero value) when the bind is attempted.
Ok, I see, makes sense! Tomorrow I will update the test to retry the operation in case of conflicts.
I rebased and force-pushed to try to fix the CI issues.
CI: https://ci.nodejs.org/job/node-test-pull-request/46909/
CI: https://ci.nodejs.org/job/node-test-pull-request/46936/
@tniessen I didn't know a spec exists, TBH. I'll read it and implement properly. Especially the delay thing, which will make the test changes unneeded. Thanks for pointing me there. I'll update this in the next few days.
In terms of the question to the TSC on wether we'd be ok with enbling by default (assuming we get it right and have enough baking time) my main questions is if there are cases where people would be expecting to get a specific type of connection (IPV4 or IPV6) and enabling this by default would break that.
One that I can think of is that you have set --dns-result-order=ipv4first or --dns-result-order=ipv6first. If if you asked for ipv4 or ipv6 specifically you should get that if it is possible. If if was happy eyeballs was enabled by default, but was disabled if you set --dns-result-order specifically that likely addresses that specific case.
Since people did not specify a family, we can assume they have no preference (and it is optional in the RFC) and therefore we can skip those options for now and eventually add them later (as it would be semver-minor).
In order to make it RFC compliant I just have to try the connection in series (after 250ms as advised) rather than in parallel.
@tniessen Do you agree?
Since people did not specify a family, we can assume they have no preference (and it is optional in the RFC) and therefore we can skip those options for now and eventually add them later (as it would be semver-minor).
I’m not sure I follow. The --dns-result-order
flag already exists, so we should make sure that this PR doesn’t break it (as in, if the user specifies one family or the other via that flag, I would expect that request to be honored and would supersede any attempt at doing happy eyeballs).
Since people did not specify a family, we can assume they have no preference (and it is optional in the RFC) and therefore we can skip those options for now and eventually add them later (as it would be semver-minor).
I’m not sure I follow. The
--dns-result-order
flag already exists, so we should make sure that this PR doesn’t break it (as in, if the user specifies one family or the other via that flag, I would expect that request to be honored and would supersede any attempt at doing happy eyeballs).
Yes, but what I meant is that we can the RFC does not mandate to implement this so we can eventually postpone if we run out of time.
In order to make it RFC compliant I just have to try the connection in series (after 250ms as advised) rather than in parallel.
I am not quite sure if that completely eliminates the possibility of opening more than one connection.
Side note: we don't have to adhere to the RFCs at all if we don't call it Happy Eyeballs. But it seems that, where this implementation deviates from the RFCs, the spec certainly makes sense.
Makes sense. I'll implement the connection delay and then review how much we deviate from the spec.
We can eventually rename the feature and try to stay the close to the RFC as possible if needed.
Will update this soon, considering also Node Collab Summit and NodeConf.
CI: https://ci.nodejs.org/job/node-test-pull-request/47177/
Tests are still failing in GH Actions and in Jenkins. e.g. https://github.com/nodejs/node/actions/runs/3222806238/jobs/5272255739