
Segmentation fault with SCS.jl and JuMP.jl

Open WellWellww opened this issue 2 years ago • 2 comments

  • SCS.jl v1.2.0

Hi, I was using JuMP.jl and SCS.jl to formulate and solve an optimization problem. The problem involved 234,932 optimization variables and 251,088 constraints. However, I encountered the following error:

[113945] signal (11.1): Segmentation fault
ldl_prepare at /workspace/srcdir/scs/linsys/cpu/direct/private.c:34 [inlined]
scs_init_lin_sys_work at /workspace/srcdir/scs/linsys/cpu/direct/private.c:237
init_work at /workspace/srcdir/scs/src/scs.c:890 [inlined]
scs_init at /workspace/srcdir/scs/src/scs.c:1227
scs_init at /public1/home/user/.julia/packages/SCS/owpZW/src/linear_solvers/direct.jl:25 [inlined]
_unsafe_scs_solve at /public1/home/user/.julia/packages/SCS/owpZW/src/c_wrapper.jl:390
#scs_solve#13 at /public1/home/user/.julia/packages/SCS/owpZW/src/c_wrapper.jl:349
scs_solve at /public1/home/user/.julia/packages/SCS/owpZW/src/c_wrapper.jl:278
unknown function (ip: 0x2ab7e4d97b4f)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
optimize! at /public1/home/user/.julia/packages/SCS/owpZW/src/MOI_wrapper/MOI_wrapper.jl:366
unknown function (ip: 0x2ab7e4d9355b)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
optimize! at /public1/home/user/.julia/packages/SCS/owpZW/src/MOI_wrapper/MOI_wrapper.jl:440
optimize! at /public1/home/user/.julia/packages/MathOptInterface/BlCD1/src/Utilities/cachingoptimizer.jl:316
unknown function (ip: 0x2ab7e4d7a062)
unknown function (ip: 0x2ab7e4d62309)
unknown function (ip: 0x2ab7e4d622aa)
optimize! at /public1/home/user/.julia/packages/MathOptInterface/BlCD1/src/Bridges/bridge_optimizer.jl:376 [inlined]
optimize! at /public1/home/user/.julia/packages/MathOptInterface/BlCD1/src/MathOptInterface.jl:85 [inlined]
optimize! at /public1/home/user/.julia/packages/MathOptInterface/BlCD1/src/Utilities/cachingoptimizer.jl:316
unknown function (ip: 0x2ab7e4d62272)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
#optimize!#113 at /public1/home/user/.julia/packages/JuMP/ptoff/src/optimizer_interface.jl:440
optimize! at /public1/home/user/.julia/packages/JuMP/ptoff/src/optimizer_interface.jl:410
jfptr_optimizeNOT._2915 at /public1/home/user/.julia/compiled/v1.9/JuMP/DmXqY_u22Yc.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
do_call at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:624
jl_interpret_toplevel_thunk at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1864
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1924
include at ./client.jl:478
unknown function (ip: 0x2ab7e4cd8272)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
do_call at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:624
jl_interpret_toplevel_thunk at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1864
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
_include at ./loading.jl:1924
include at ./Base.jl:457
jfptr_include_43521.clone_1 at /public1/home/user/julia/julia-1.9.0/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
exec_options at ./client.jl:307
_start at ./client.jl:522
jfptr__start_37386.clone_1 at /public1/home/user/julia/julia-1.9.0/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
true_main at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/jlapi.c:573
jl_repl_entrypoint at /cache/build/default-amdci4-0/julialang/julia-release-1-dot-9/src/jlapi.c:717
main at julia (unknown line)
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 122536122659 (Pool: 122441856243; Big: 94266416); GC: 93

I want to note that memory should not have been the limit: my system has 2 TB of memory in total and 64 cores, and the code used only around 1 TB. Interestingly, when I worked on smaller-scale optimization problems (say, 79,054 variables and 84,145 constraints) constructed using the same principle, the program returned the correct results. Hence, I am curious whether anyone has any insight into why this error occurs. For instance, does it occur because the problem is simply too large?

Other information:

julia> versioninfo()
Julia Version 1.9.0
Commit 8e630552924 (2023-05-07 11:25 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, cascadelake)
  Threads: 1 on 32 virtual cores
Environment:
  LD_LIBRARY_PATH = /public1/soft/intel/2015/impi/5.0.3.049/intel64/lib:/public1/soft/intel/2015/composer_xe_2015.6.233/debugger/libipt/intel64/lib:/public1/soft/intel/2015/composer_xe_2015.6.233/tbb/lib/intel64/gcc4.4:/public1/soft/intel/2015/composer_xe_2015.6.233/mkl/lib/intel64:/public1/soft/intel/2015/composer_xe_2015.6.233/ipp/tools/intel64/perfsys:/public1/soft/intel/2015/composer_xe_2015.6.233/ipp/lib/intel64:/public1/soft/intel/2015/composer_xe_2015.6.233/ipp/../compiler/lib/intel64:/public1/soft/intel/2015/composer_xe_2015.6.233/mpirt/lib/intel64:/public1/soft/intel/2015/composer_xe_2015.6.233/compiler/lib/intel64
  LD_LIBRARY_PATH_modshare = /public1/soft/intel/2015/composer_xe_2015.6.233/tbb/lib/intel64/gcc4.4:1:/public1/soft/intel/2015/composer_xe_2015.6.233/ipp/tools/intel64/perfsys:1:/public1/soft/intel/2015/composer_xe_2015.6.233/ipp/../compiler/lib/intel64:1:/public1/soft/intel/2015/composer_xe_2015.6.233/compiler/lib/intel64:1:/public1/soft/intel/2015/composer_xe_2015.6.233/debugger/libipt/intel64/lib:1:/public1/soft/intel/2015/composer_xe_2015.6.233/mkl/lib/intel64:1:/public1/soft/intel/2015/composer_xe_2015.6.233/mpirt/lib/intel64:1:/public1/soft/intel/2015/impi/5.0.3.049/intel64/lib:1:/public1/soft/intel/2015/composer_xe_2015.6.233/ipp/lib/intel64:1

WellWellww, Jun 28 '23 17:06

You could try running import MKL_jll before using SCS, and then passing linear_solver=SCS.MKLDirectSolver.

In my case (a problem with n = 570684 variables and m = 1030531 constraints), as reported here, SCS.DirectSolver ran out of memory with a similar segfault. The MKL solver used much less memory and solved the problem much faster. Give it a try!
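A minimal sketch of what this suggestion might look like in a JuMP script. It assumes the MKL_jll package is installed and that the installed SCS.jl version exposes SCS.MKLDirectSolver once MKL_jll has been loaded; the tiny model is a hypothetical placeholder, not the problem from this issue.

```julia
# Load MKL_jll first so that SCS.jl can pick up the MKL-backed linear solver
# (loading order as suggested in this thread).
import MKL_jll
using JuMP
using SCS

# Pass the linear solver as a raw optimizer attribute named "linear_solver".
optimizer = optimizer_with_attributes(
    SCS.Optimizer,
    "linear_solver" => SCS.MKLDirectSolver,
)
model = Model(optimizer)

# Hypothetical placeholder problem; replace it with the real model.
@variable(model, x >= 0)
@constraint(model, x <= 2)
@objective(model, Max, x)

optimize!(model)
println(termination_status(model))
```

If the attribute is left unset, SCS.jl falls back to its default direct solver (SCS.DirectSolver), which is the code path shown in the stack trace above.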

kalmarek, Jun 28 '23 19:06

Depending on the sparsity, the problem might be too large. It's a little unusual to run out of memory (OOM), but it can happen. I agree with @kalmarek: you should try the MKL version of SCS, which is typically faster and more memory-efficient. Otherwise, you could post on the SCS.jl repo for help.

bodono, Jun 29 '23 09:06