julia
julia copied to clipboard
CI: On `i686-linux-gnu`, the `Downloads` test set is failing frequently (but not always) with `nested task error: RequestError: getaddrinfo() thread failed to start while requesting https://httpbingo.julialang.org/delay/2?id=100`
The error seems to be: nested task error: RequestError: getaddrinfo() thread failed to start while requesting https://httpbingo.julialang.org/delay/2?id=100
Relevant log snippet:
Error in testset Downloads:
Error During Test at /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/runtests.jl:355
Got exception outside of a @test
TaskFailedException
Stacktrace:
[1] macro expansion
@ ./experimental.jl:102 [inlined]
[2] macro expansion
@ ./timing.jl:394 [inlined]
[3] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/runtests.jl:366 [inlined]
[4] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Test/src/Test.jl:1388 [inlined]
[5] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/runtests.jl:356 [inlined]
[6] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Test/src/Test.jl:1388 [inlined]
[7] top-level scope
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/runtests.jl:4
[8] include
@ ./Base.jl:424 [inlined]
[9] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/test/testdefs.jl:24 [inlined]
[10] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Test/src/Test.jl:1388 [inlined]
[11] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/test/testdefs.jl:23 [inlined]
[12] macro expansion
@ ./timing.jl:475 [inlined]
[13] runtests(name::String, path::String, isolate::Bool; seed::UInt128)
@ Main /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/test/testdefs.jl:21
[14] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, UInt128, Tuple{Symbol}, NamedTuple{(:seed,), Tuple{UInt128}}})
@ Base ./essentials.jl:810
[15] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
@ Distributed /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:285
[16] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
@ Distributed /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:70
[17] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Distributed/src/process_messages.jl:285 [inlined]
[18] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
@ Distributed ./task.jl:501
nested task error: RequestError: getaddrinfo() thread failed to start while requesting https://httpbingo.julialang.org/delay/2?id=100
Stacktrace:
[1] (::Downloads.var"#9#18"{IOBuffer, Base.DevNull, Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Nothing, Bool, String, Bool, Bool})(easy::Downloads.Curl.Easy)
@ Downloads /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Downloads.jl:388
[2] with_handle(f::Downloads.var"#9#18"{IOBuffer, Base.DevNull, Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Nothing, Bool, String, Bool, Bool}, handle::Downloads.Curl.Easy)
@ Downloads.Curl /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Curl/Curl.jl:88
[3] #8
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Downloads.jl:329 [inlined]
[4] arg_write(f::Downloads.var"#8#17"{Base.DevNull, Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Nothing, Bool, String, Bool, Bool}, arg::IOBuffer)
@ ArgTools /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/ArgTools/src/ArgTools.jl:134
[5] #7
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Downloads.jl:328 [inlined]
[6] arg_read
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/ArgTools/src/ArgTools.jl:76 [inlined]
[7] request(url::String; input::Nothing, output::IOBuffer, method::Nothing, headers::Vector{Pair{String, String}}, timeout::Float64, progress::Nothing, verbose::Bool, debug::Nothing, throw::Bool, downloader::Downloads.Downloader)
@ Downloads /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Downloads.jl:327
[8] #3
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Downloads.jl:233 [inlined]
[9] arg_write(f::Downloads.var"#3#4"{Nothing, Vector{Pair{String, String}}, Float64, Nothing, Bool, Nothing, Downloads.Downloader, String}, arg::IOBuffer)
@ ArgTools /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/ArgTools/src/ArgTools.jl:134
[10] #download#2
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/src/Downloads.jl:232 [inlined]
[11] (::Main.Test42Main_Downloads.var"#2#3"{Base.Pairs{Symbol, Downloads.Downloader, Tuple{Symbol}, NamedTuple{(:downloader,), Tuple{Downloads.Downloader}}}, String})(output::IOBuffer)
@ Main.Test42Main_Downloads /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/setup.jl:13
[12] sprint
@ ./strings/io.jl:107 [inlined]
[13] #download_body#1
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/setup.jl:12 [inlined]
[14] #download_json#4
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/setup.jl:18 [inlined]
[15] macro expansion
@ /cache/build/default-amdci5-0/julialang/julia-master/julia-9bdaabdc82/share/julia/stdlib/v1.9/Downloads/test/runtests.jl:368 [inlined]
[16] (::Main.Test42Main_Downloads.var"#24#50"{Downloads.Downloader, String, Int32})()
@ Main.Test42Main_Downloads ./task.jl:501
Example log: https://buildkite.com/julialang/julia-master/builds/14198#01822720-ae76-4ff4-8a35-cec678dfa6a7
https://github.com/JuliaCI/julia-buildkite/pull/185 may fix this, so would be good to check
This Win64 job failed with
nested task error: RequestError: Failed to connect to httpbingo.julialang.org port 443 after 21058 ms: Timed out while requesting https://httpbingo.julialang.org/delay/2?id=40
which looks similar enough?
Sadly, I am still seeing the original failure (nested task error: RequestError: getaddrinfo() thread failed to start while requesting https://httpbingo.julialang.org/delay/2?id=98
) on i686-linux-gnu
even after I merged https://github.com/JuliaCI/julia-buildkite/pull/185.
Example log: https://buildkite.com/julialang/julia-master/builds/14331#01823cdc-7767-4c4b-bcc8-976db68868de
In the above log, it says that JULIA_NUM_THREADS
was set to 1
, and that the value of nthreads()
was 1
, so it does look like threads were disabled in that job.
This Win64 job failed with
nested task error: RequestError: Failed to connect to httpbingo.julialang.org port 443 after 21058 ms: Timed out while requesting https://httpbingo.julialang.org/delay/2?id=40
which looks similar enough?
Hmmm. I don't think that's the same failure.
I did see the exact same failure as the original post (nested task error: RequestError: getaddrinfo() thread failed to start while requesting https://httpbingo.julialang.org/delay/2?id=1
) just now on Buildbot on i686-w64-mingw32
.
Log: https://build.julialang.org/#/builders/42/builds/7253
I should point out that even though this failure didn't go away once I merged https://github.com/JuliaCI/julia-buildkite/pull/185, it certainly did become less frequent.
I continue to see this on i686-linux-gnu
.
@vtjnash @StefanKarpinski Could this be a libuv issue? We are using libuv's getaddrinfo
, right?
Looking closer, the test fails at 20-30 seconds, but the expected maximum time is 180 seconds: https://github.com/JuliaLang/Downloads.jl/issues/204
Thus, we are simply running too many downloads concurrently, and mbedtls is too slow to keep up.
Interestingly, same error as I saw here: https://github.com/JuliaLang/Downloads.jl/issues/196#issuecomment-1255391151. Seems like there's some race condition in libcurl that we're hitting in some cases. In the link, putting a try/catch in a cfunction someone causes it to be triggered more easily.
Ah, sorry, I was responding to https://github.com/JuliaLang/julia/issues/46159#issuecomment-1194557004, which is a different issue (that happens to be in roughly the same place, for related but not identical reasons)
This semaphore may help: https://github.com/JuliaLang/Downloads.jl/pull/205.
You can test this on your own machine (both issues here should turn out to have the same fix) by changing the 100 concurrent attempts to 1500 (or even higher) in this test