Distributed worker manager doesn't use the socket connection to infer the worker IP
https://github.com/JuliaLang/julia/blob/0d00660a38f4d4049e12a97399e4ef613bf0d7dc/stdlib/Distributed/src/managers.jl#L568
For some reason we don't take advantage of the fact that we can call Sockets.getpeername() here; instead, we read the stdout of the worker process.
This is problematic mainly because:
- the worker nodes always report the address of the first IPv4 interface, regardless of whether that is the interface actually used to contact the main node: https://github.com/JuliaLang/julia/blob/0d00660a38f4d4049e12a97399e4ef613bf0d7dc/stdlib/Sockets/src/addrinfo.jl#L272-L276
- the worker node may be running inside a container (or, for whatever other reason, have a virtual interface listed before everything else)
My question: can we add a specialization of read_worker_host_port for the case where config.io :: Sockets.TCPSocket?
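To illustrate what such a specialization could rely on, here is a minimal sketch of reading the peer address off a connected socket with Sockets.getpeername. The helper name `peer_host` is hypothetical and is not part of Distributed; the loopback server only stands in for the main node. Note that getpeername also returns the remote port of that particular connection, which would be the peer's ephemeral source port if the peer initiated the connection, so the worker's listening port would presumably still have to come from somewhere else.

```julia
using Sockets

# Hypothetical helper (not part of Distributed): return the remote address
# of an already-connected socket as a string.
peer_host(sock::TCPSocket) = string(first(getpeername(sock)))

# Loopback demo: one end listens, the other connects.
port, server = listenany(IPv4(127, 0, 0, 1), 9000)
client_task = @async connect(IPv4(127, 0, 0, 1), port)
conn = accept(server)            # the listening side's view of the connection
client = fetch(client_task)

# Address the peer actually used for this connection.
host = peer_host(conn)
println(host)

close(client); close(conn); close(server)
```

Because the address is taken from the live connection, it reflects whichever interface the peer really used, rather than the first interface it happens to enumerate.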
bash-4.2$ route | grep '^default' | grep -o '[^ ]*$'
ens1f0.3604
This shows that we should be using:
192.170.240.0
but the first IP address libuv came up with is 192.168.240.0.
I couldn't find a way to look up the default interface in libuv.
Either we detect the default interface, or we need something like: https://github.com/JuliaWeb/IPNets.jl/blob/92a9364b4f12b4762ecfa3d6d233ab27aee6c5c4/src/IPNets.jl#L217
at this location: https://github.com/JuliaLang/julia/blob/0d00660a38f4d4049e12a97399e4ef613bf0d7dc/stdlib/Sockets/src/addrinfo.jl#L273
to filter out private IP ranges.
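A rough sketch of what the filtering idea could look like, written against plain Sockets so it stays self-contained. The names `is_private` and `pick_address` are hypothetical, the range check covers only the RFC 1918 IPv4 blocks, and the fallback to the first address is my assumption (many clusters use private addresses exclusively, so dropping them outright would leave nothing to report). This is not what IPNets.jl or addrinfo.jl actually does.

```julia
using Sockets

# RFC 1918 private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
function is_private(ip::IPv4)
    h = ip.host                      # the address as a UInt32
    return (h >> 24) == 0x0a ||
           (h >> 20) == 0xac1 ||
           (h >> 16) == 0xc0a8
end

# Hypothetical selection rule: prefer the first non-private address,
# falling back to the first address when every candidate is private.
function pick_address(addrs::AbstractVector{IPv4})
    isempty(addrs) && throw(ArgumentError("no addresses to choose from"))
    i = findfirst(!is_private, addrs)
    return i === nothing ? first(addrs) : addrs[i]
end

# The two addresses from the report above, in the order libuv listed them.
candidates = [ip"192.168.240.0", ip"192.170.240.0"]
chosen = pick_address(candidates)
println(chosen)
```

In a real patch the candidate list would presumably come from getipaddrs(IPv4) instead of a hard-coded vector. This remains a heuristic; detecting the default interface or using the socket's peer address would be more reliable.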