HomotopyContinuation.jl

`solve` never terminates with `threading=true`

Open AayushSabharwal opened this issue 1 year ago • 1 comments

This might be a platform-specific thing, or it may be related to #560.

MWE (`threading = false` works as expected):

julia> using HomotopyContinuation
julia> @var x y z
julia> eqs = [
           x^2 + y^2 + 2x*y
           x^2 + 4x + 4
           y * z + 4x^2
       ]
julia> sys = System(eqs)
julia> sol = solve(sys)

This never terminates. I've had it running for upwards of 10 minutes. Nothing is printed either. Interrupting with Ctrl-C does nothing, but if I press it again I get:

julia> sol = solve(sys)
^C^CError showing value of type

And nothing more. Interrupting once more:

^C^CError showing value of type ^CERROR:

Then I interrupt several times more:

^C^CError showing value of type ^CERROR: ^C
^C^C^C^C^C^C^C^C^CWARNING: Force throwing a SIGINT
ERROR:

Then once more:

InterruptException()
safepoint at ./gcutils.jl:255 [inlined]
multiq_deletemin at ./partr.jl:134
trypoptask at ./task.jl:1004
jfptr_trypoptask_65985.1 at /Users/aayush/.julia/juliaup/julia-1.11.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
get_next_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/scheduler.c:377 [inlined]
ijl_task_get_next at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/scheduler.c:438
poptask at ./task.jl:1012
wait at ./task.jl:1021
task_done_hook at ./task.jl:694
jfptr_task_done_hook_65908.1 at /Users/aayush/.julia/juliaup/julia-1.11.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/./julia.h:2157 [inlined]
jl_finish_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:319
start_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:1213

And then a few more times:

^C^CError showing value of type ^CERROR: ^C
^C^C^C^C^C^C^C^C^CWARNING: Force throwing a SIGINT
ERROR: ^Cfatal: error thrown and no exception handler available.
InterruptException()
safepoint at ./gcutils.jl:255 [inlined]
multiq_deletemin at ./partr.jl:134
trypoptask at ./task.jl:1004
jfptr_trypoptask_65985.1 at /Users/aayush/.julia/juliaup/julia-1.11.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
get_next_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/scheduler.c:377 [inlined]
ijl_task_get_next at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/scheduler.c:438
poptask at ./task.jl:1012
wait at ./task.jl:1021
task_done_hook at ./task.jl:694
jfptr_task_done_hook_65908.1 at /Users/aayush/.julia/juliaup/julia-1.11.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/./julia.h:2157 [inlined]
jl_finish_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:319
start_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:1213
^CInterruptException:^C
atexit hook threw an error: InterruptException()
safepoint at ./gcutils.jl:255 [inlined]
multiq_deletemin at ./partr.jl:134
trypoptask at ./task.jl:1004
jfptr_trypoptask_65985.1 at /Users/aayush/.julia/juliaup/julia-1.11.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
get_next_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/scheduler.c:377 [inlined]
ijl_task_get_next at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/scheduler.c:438
poptask at ./task.jl:1012
wait at ./task.jl:1021
uv_write at ./stream.jl:1072
unsafe_write at ./stream.jl:1145
write at ./strings/io.jl:248 [inlined]
print at ./strings/io.jl:250 [inlined]
showerror at ./errorshow.jl:156
unknown function (ip: 0x34250005b)
_atexit at ./initdefs.jl:459
jfptr__atexit_68559.1 at /Users/aayush/.julia/juliaup/julia-1.11.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/./julia.h:2157 [inlined]
ijl_atexit_hook at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/init.c:271
ijl_exit at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/init.c:207
ijl_no_exc_handler at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:698
jl_finish_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:322
start_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-11/src/task.c:1213

Environment:

Project and Manifest
  Updating `/private/var/folders/8b/tx4_63kd6c51swrwfwj8g3mr0000gn/T/jl_RJPvX6/Project.toml`
  [f213a82b] + HomotopyContinuation v2.11.1
    Updating `/private/var/folders/8b/tx4_63kd6c51swrwfwj8g3mr0000gn/T/jl_RJPvX6/Manifest.toml`
  [398f06c4] + AbstractLattices v0.3.1
  [79e6a3ab] + Adapt v4.0.4
  [66dad0bd] + AliasTables v1.1.3
  [fb37089c] + Arblib v1.2.1
  [4fba245c] + ArrayInterface v7.16.0
  [62783981] + BitTwiddlingConvenienceFunctions v0.1.6
  [2a0fbf3d] + CPUSummary v0.2.6
  [d360d2e6] + ChainRulesCore v1.25.0
  [fb6a15b2] + CloseOpenIntervals v0.1.13
  [861a8166] + Combinatorics v1.0.2
  [38540f10] + CommonSolve v0.2.4
  [bbf7d656] + CommonSubexpressions v0.3.1
  [f70d9fcc] + CommonWorldInvalidations v1.0.0
  [34da2185] + Compat v4.16.0
  [187b0558] + ConstructionBase v1.5.8
  [adafc99b] + CpuId v0.3.1
  [a8cc5b0e] + Crayons v4.1.1
  [9a962f9c] + DataAPI v1.16.0
  [864edb3b] + DataStructures v0.18.20
  [e2d170a0] + DataValueInterfaces v1.0.0
  [8bb1440f] + DelimitedFiles v1.9.1
  [163ba53b] + DiffResults v1.1.0
  [b552c78f] + DiffRules v1.15.1
  [31c24e10] + Distributions v0.25.112
  [ffbed154] + DocStringExtensions v0.9.3
  [7c1d4256] + DynamicPolynomials v0.6.0
  [fdbdab4c] + ElasticArrays v1.2.12
  [1a297f60] + FillArrays v1.13.0
  [6a86dc24] + FiniteDiff v2.26.0
  [f6369f11] + ForwardDiff v0.10.36
  [f213a82b] + HomotopyContinuation v2.11.1
  [3e5b6fbb] + HostCPUFeatures v0.1.17
  [34004b35] + HypergeometricFunctions v0.3.24
  [615f187c] + IfElse v0.1.1
  [18e54dd8] + IntegerMathUtils v0.1.2
  [524e6230] + IntervalTrees v1.1.0
  [92d709cd] + IrrationalConstants v0.2.2
  [c8e1da08] + IterTools v1.10.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
  [692b3bcd] + JLLWrappers v1.6.1
  [8ac3fa9e] + LRUCache v1.6.1
  [b964fa9f] + LaTeXStrings v1.3.1
  [10f19ff3] + LayoutPointers v0.1.17
  [9c8b4983] + LightXML v0.9.1
  [d3d80556] + LineSearches v7.3.0
  [9b3f67b0] + LinearAlgebraX v0.2.10
  [2ab3a3ac] + LogExpFunctions v0.3.28
  [bdcacae8] + LoopVectorization v0.12.171
  [1914dd2f] + MacroTools v0.5.13
  [d125e4d3] + ManualMemory v0.1.8
  [e1d29d7a] + Missings v1.2.0
  [291d046c] + MixedSubdivisions v1.1.5
  [7475f97c] + Mods v2.2.5
  [3b2b4ff1] + Multisets v0.4.5
  [102ac46a] + MultivariatePolynomials v0.5.7
  [d8a4904e] + MutableArithmetics v1.5.0
  [d41bc354] + NLSolversBase v7.8.3
  [77ba4419] + NaNMath v1.0.2
  [6fe1bfb0] + OffsetArrays v1.14.1
  [429524aa] + Optim v1.9.4
  [bac558e1] + OrderedCollections v1.6.3
  [90014a1f] + PDMats v0.11.31
  [d96e819e] + Parameters v0.12.3
  [2ae35dd2] + Permutations v0.4.22
  [1d0040c9] + PolyesterWeave v0.2.2
  [f27b6e38] + Polynomials v4.0.11
  [85a6dd25] + PositiveFactorizations v0.2.4
  [aea7be01] + PrecompileTools v1.2.1
  [21216c6a] + Preferences v1.4.3
  [08abe8d2] + PrettyTables v2.4.0
  [27ebfcd6] + Primes v0.5.6
  [92933f4c] + ProgressMeter v1.10.2
  [01f381cc] + ProjectiveVectors v1.1.4
  [43287f4e] + PtrArrays v1.2.1
  [1fd47b50] + QuadGK v2.11.1
  [3cdcf5f2] + RecipesBase v1.3.4
  [189a3867] + Reexport v1.2.2
  [ae029012] + Requires v1.3.0
  [286e9d63] + RingLists v0.2.9
  [79098fc4] + Rmath v0.8.0
  [94e857df] + SIMDTypes v0.1.0
  [476501e8] + SLEEFPirates v0.6.43
  [8e049039] + SemialgebraicSets v0.3.3
  [efcf1570] + Setfield v1.1.1
  [55797a34] + SimpleGraphs v0.8.6
  [ec83eff0] + SimplePartitions v0.3.3
  [cc47b68c] + SimplePolynomials v0.2.18
  [a6525b86] + SimpleRandom v0.3.2
  [a2af1166] + SortingAlgorithms v1.2.1
  [276daf66] + SpecialFunctions v2.4.0
  [aedffcd0] + Static v1.1.1
  [0d7ed370] + StaticArrayInterface v1.8.0
  [90137ffa] + StaticArrays v1.9.7
  [1e83bf80] + StaticArraysCore v1.4.3
  [10745b16] + Statistics v1.11.1
  [82ae8749] + StatsAPI v1.7.0
  [2913bbd2] + StatsBase v0.34.3
  [4c63d2b9] + StatsFuns v1.3.2
  [892a3eda] + StringManipulation v0.4.0
  [09ab397b] + StructArrays v0.6.18
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.12.0
  [8290d209] + ThreadingUtilities v0.5.2
  [a2a6695c] + TreeViews v0.3.0
  [3a884ed6] + UnPack v1.0.2
  [3d5dd08c] + VectorizationBase v0.21.70
  [e134572f] + FLINT_jll v300.100.300+0
  [94ce4f54] + Libiconv_jll v1.17.0+0
  [2ce0c516] + MPC_jll v1.2.1+0
  [656ef2d0] + OpenBLAS32_jll v0.3.28+3
  [efe28fd5] + OpenSpecFun_jll v0.5.5+0
  [f50d1b31] + Rmath_jll v0.5.1+0
⌅ [3428059b] + SymEngine_jll v0.9.0+1
  [02c8fc9c] + XML2_jll v2.13.3+0
  [0dad84c5] + ArgTools v1.1.2
  [56f22d72] + Artifacts v1.11.0
  [2a0f44e3] + Base64 v1.11.0
  [ade2ca70] + Dates v1.11.0
  [8ba89e20] + Distributed v1.11.0
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching v1.11.0
  [9fa8497b] + Future v1.11.0
  [b77e0a4c] + InteractiveUtils v1.11.0
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2 v1.11.0
  [8f399da3] + Libdl v1.11.0
  [37e2e46d] + LinearAlgebra v1.11.0
  [56ddb016] + Logging v1.11.0
  [d6f4376e] + Markdown v1.11.0
  [a63ad114] + Mmap v1.11.0
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.11.0
  [de0858da] + Printf v1.11.0
  [9a3f8284] + Random v1.11.0
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization v1.11.0
  [6462fe0b] + Sockets v1.11.0
  [2f01184e] + SparseArrays v1.11.0
  [4607b0f0] + SuiteSparse
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [8dfed614] + Test v1.11.0
  [cf7118a7] + UUIDs v1.11.0
  [4ec0a83e] + Unicode v1.11.0
  [e66e0078] + CompilerSupportLibraries_jll v1.1.1+0
  [781609d7] + GMP_jll v6.3.0+0
  [deac9b47] + LibCURL_jll v8.6.0+0
  [e37daf67] + LibGit2_jll v1.7.2+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [3a97d323] + MPFR_jll v4.2.1+0
  [c8ffd9c3] + MbedTLS_jll v2.28.6+0
  [14a3606d] + MozillaCACerts_jll v2023.12.12
  [4536629a] + OpenBLAS_jll v0.3.27+1
  [05823500] + OpenLibm_jll v0.8.1+2
  [bea87d4a] + SuiteSparse_jll v7.7.0+0
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850b90] + libblastrampoline_jll v5.11.0+0
  [8e850ede] + nghttp2_jll v1.59.0+0
  [3f19e933] + p7zip_jll v17.4.0+2

versioninfo():

julia> versioninfo()
Julia Version 1.11.0
Commit 501a4f25c2b (2024-10-07 11:40 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M2
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, apple-m2)
Threads: 16 default, 2 interactive, 4 GC (on 4 virtual cores)
Environment:
  JULIA_PKG_USE_CLI_GIT = true
  JULIA_PKG_SERVER = https://internal.juliahub.com

This happens on 1.10 as well.

AayushSabharwal avatar Oct 12 '24 12:10 AayushSabharwal
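
For readers who hit the same hang, the non-threaded call mentioned in the MWE note can be written out as below. This is only a sketch of the workaround described in the report (it reuses the same three equations and the `threading` keyword of `solve`); it does not address the underlying cause.

```julia
using HomotopyContinuation

# Same system as in the MWE above.
@var x y z
sys = System([
    x^2 + y^2 + 2x*y,
    x^2 + 4x + 4,
    y * z + 4x^2,
])

# `threading = true` is the default for `solve`; passing `false`
# turns multithreaded path tracking off, which the report says
# terminates as expected.
sol = solve(sys; threading = false)
```

Path tracking then runs without multithreading, so this trades speed for a call that returns.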

This is odd given that some tests include testing threading. I will look into it later this month (I have no time to do it earlier, sorry...). Thanks for noticing.

PBrdng avatar Oct 15 '24 06:10 PBrdng

@AayushSabharwal Can you run the code on your computer using #598? I use threading and solve terminates.

PBrdng avatar Oct 30 '24 15:10 PBrdng

Unfortunately it does not fix the issue. Thank you for looking into the problem though!

AayushSabharwal avatar Oct 30 '24 16:10 AayushSabharwal

The problem is that I can't reproduce this error (and hence not fix it). Here is my versioninfo():

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (x86_64-apple-darwin22.4.0)
  CPU: 4 × Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 2 default, 0 interactive, 1 GC (on 4 virtual cores)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 2

Any idea where I can change my setting so that I can reproduce the error?

PBrdng avatar Oct 30 '24 17:10 PBrdng

I don't think it's a setting thing unfortunately. It feels like a platform issue with Apple silicon. Considering that threading = true is the default for solve, I imagine if this was easy to hit you would've seen a lot more people complaining. I'll try and step through solve with the debugger when I have the time. If I'm able to isolate where it freezes, I'll update this thread.

AayushSabharwal avatar Oct 30 '24 17:10 AayushSabharwal

Ok. Thank you!

PBrdng avatar Oct 31 '24 12:10 PBrdng

I finally got around to this again :D The process freezes immediately after running https://github.com/JuliaHomotopyContinuation/HomotopyContinuation.jl/blob/1f344bd2efb76d8dfb0b89b0baab4f76b0917ccf/src/solve.jl#L606.

AayushSabharwal avatar Dec 11 '24 12:12 AayushSabharwal

Then the problem seems to be @tspawnat. This thread suggests running Julia with 1 interactive thread for the REPL and $n$ threads for other tasks (so julia -t 8,1 for $n=8$). Can you try that?

PBrdng avatar Dec 13 '24 05:12 PBrdng

That worked! Thanks. Really weird that it causes the REPL to hang like that.

AayushSabharwal avatar Dec 14 '24 10:12 AayushSabharwal

Indeed weird. I will add a remark to the documentation.

PBrdng avatar Dec 16 '24 11:12 PBrdng
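
After starting Julia with `julia -t 8,1` as suggested above, the thread pool configuration can be inspected from the REPL before calling `solve`. The snippet below is a minimal sketch that uses only `Base.Threads` and assumes Julia 1.9 or later, where the `:default` and `:interactive` pools exist. The helper name `thread_pool_summary` is hypothetical and is not part of HomotopyContinuation.jl.

```julia
# Hypothetical helper (not part of any package): report how many threads
# each pool has and which pool the current task is running on.
function thread_pool_summary()
    return (
        default = Threads.nthreads(:default),
        interactive = Threads.nthreads(:interactive),
        current_pool = Threads.threadpool(),
    )
end

summary = thread_pool_summary()
println("default threads:     ", summary.default)
println("interactive threads: ", summary.interactive)
println("current task pool:   ", summary.current_pool)
```

With `-t 8,1` the first two lines should report 8 and 1 respectively. Note that the configuration in the original report already had interactive threads (`16 default, 2 interactive`), so this check only confirms what Julia was started with. It does not explain why the `-t 8,1` setting avoids the hang.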