RemoteREPL.jl
Significant overhead/latency (about 50ms)
I mentioned in a comment on this issue that I had some latency issues when using RemoteREPL with my Raspberry Pi. I have now checked on localhost, with no SSH and with everything running on the same modern computer, and there is STILL almost 50 ms of latency just from evaluating 1 and returning the result:
julia> @btime @remote 1
43.513 ms (66 allocations: 3.61 KiB)
1
After loading ProfileView (using ProfileView) and running @profview @remote 1, I get a flamegraph. From the top, the call sites that make up the flamegraph are:
./task.jl:795, MethodInstance for poptask(::Base.InvasiveLinkedListSynchronized{Task})
./task.jl:804, MethodInstance for wait()
./condition.jl:106, MethodInstance for wait(::Base.GenericCondition{Base.Threads.SpinLock})
./stream.jl:413, MethodInstance for wait_readnb(::Sockets.TCPSocket, ::Int64)
./stream.jl:106, eof [inlined]
./stream.jl:925, MethodInstance for read(::Sockets.TCPSocket, ::Type{UInt8})
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782, deserialize [inlined]
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:769, MethodInstance for deserialize(::Sockets.TCPSocket)
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:207, MethodInstance for var"#send_and_receive#40"(::Bool, ::typeof(RemoteREPL.send_and_receive), ::RemoteREPL.Connection, ::Tuple{Symbol, Int64})
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:199, send_and_receive [inlined]
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:382, MethodInstance for (::RemoteREPL.var"#47#48"{RemoteREPL.Connection, Int64})()
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:178, MethodInstance for var"#ensure_connected!#39"(::Int64, ::typeof(RemoteREPL.ensure_connected!), ::RemoteREPL.var"#47#48"{RemoteREPL.Connection, Int64}, ::RemoteREPL.Connection)
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:174, ensure_connected! [inlined]
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:380, MethodInstance for remote_eval_and_fetch(::RemoteREPL.Connection, ::Int64)
./boot.jl:360, eval [inlined]
I am not sure whether this can be improved, or whether this wait time is inherent to network communication. Either way, it is worth investigating whether this ~50 ms of latency on every remote call can be avoided.
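To check whether the wait sits in the TCP layer rather than in RemoteREPL itself, a minimal sketch along the following lines might help. It only mirrors the serialize/deserialize round trip visible in the trace above, using the Sockets and Serialization standard libraries. The echo server, the port hint 9100, and the roundtrip helper are all assumptions made for illustration and are not part of RemoteREPL. Whether it shows the same delay will depend on the system.

```julia
using Sockets, Serialization

# Hypothetical loopback echo server; 9100 is only a port hint, the OS may pick another.
port, server = listenany(Sockets.localhost, 9100)

@async begin
    sock = accept(server)
    # Echo every deserialized message back until the client closes the connection.
    while !eof(sock)
        msg = deserialize(sock)
        serialize(sock, msg)
    end
end

client = connect(Sockets.localhost, port)

# Hypothetical helper: send one value and block until the reply arrives.
function roundtrip(io, value)
    serialize(io, value)
    return deserialize(io)
end

result = roundtrip(client, 1)   # warm-up call, also compiles the code path

n = 10
elapsed = @elapsed for _ in 1:n
    roundtrip(client, 1)
end
println("mean round trip: ", round(1000 * elapsed / n; digits = 3), " ms")

close(client)
close(server)
```

If the mean round trip here is of the same order as the @btime result, the delay is probably not specific to RemoteREPL.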
https://github.com/JuliaLang/julia/issues/31842 ?
Naively 50 ms seems pretty crazy high on the loopback interface?
I expect this is more a Julia issue than a problem in this package but if we can invent a workaround that's great. Thanks @xgdgsc for the link :-)
The linked issue has a comment where Jeff says that the culprit is the "Nagle algorithm". It can be disabled:
help?> Sockets.nagle
nagle(socket::Union{TCPServer, TCPSocket}, enable::Bool)
Enables or disables Nagle's algorithm on a given TCP server or socket.
│ Julia 1.3
│
│ This function requires Julia 1.3 or later.
Should we use Sockets.nagle to disable this algorithm by default? I imagine we generally do not want a 50 ms delay in exchange for fewer packets on a communication channel that is not shared by multiple users.
Correct, you should not be using Nagle's algorithm for interactive sockets - it's intended for high-bandwidth, high-latency TCP connections (such as data downloads).
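For reference, a minimal sketch of what disabling Nagle's algorithm looks like with plain Sockets. The echo server and the port hint 9000 are hypothetical and only there to make the example self-contained. This is not the RemoteREPL code, just the Sockets.nagle(socket, false) call from the help text above applied to both ends of a loopback connection.

```julia
using Sockets

# Hypothetical loopback echo server; 9000 is only a port hint.
port, server = listenany(Sockets.localhost, 9000)

@async begin
    sock = accept(server)
    Sockets.nagle(sock, false)      # disable Nagle on the accepted (server-side) socket
    while !eof(sock)
        write(sock, read(sock, UInt8))   # echo one byte at a time
    end
end

client = connect(Sockets.localhost, port)
Sockets.nagle(client, false)        # disable Nagle on the client socket

write(client, 0x2a)                 # send a single byte
reply = read(client, UInt8)         # block until the echoed byte comes back

close(client)
close(server)
```

Each side controls only its own outgoing packets, so the call is made on both the client socket and the accepted server socket.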
PR created. The effect was a 74x reduction in overhead, from adding a single line!