SCS segmentation fault

Open prateeky2806 opened this issue 6 years ago • 3 comments

Hi all, I am solving an SDP over an n x n matrix using CVXPY with the SCS solver on the GPU. With n = 10000, the program crashed with a segmentation fault (core dumped). The stdout looked like this:

----------------------------------------------------------------------------
SCS v2.0.2 - Splitting Conic Solver
(c) Brendan O'Donoghue, Stanford University, 2012-2017
----------------------------------------------------------------------------
Lin-sys: sparse-indirect GPU, nnz in A = 99997194, CG tol ~ 1/iter^(2.00)
eps = 1.00e-05, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00
acceleration_lookback = 20, rho_x = 1.00e-03
Variables n = 50005000, constraints m = 99987195
Cones:	primal zero / dual free vars: 49982195
	sd vars: 50005000, sd blks: 1
Setup time: 6.96e+00s
----------------------------------------------------------------------------
 Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
----------------------------------------------------------------------------
Segmentation fault (core dumped)

The same program works for n = 5000, where it took about 3.5 hours to solve the SDP.

prateeky2806, Mar 25 '18 14:03

My guess is that your GPU is running out of memory. Sometimes it gives a nice error when this happens and sometimes it just segfaults, so it can be hard to detect. You might be able to get more of a handle on the memory usage by running nvidia-smi while the solver runs. There are ways to decrease the amount of memory that SCS uses, in particular by setting the macro GPU_TRANSPOSE_MAT to false when compiling, and also by using 32-bit floats (which can affect stability).
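For a rough sense of scale, the sizes printed in the log above can be turned into a back-of-the-envelope memory estimate. The sketch below is only an approximation under stated assumptions, none of which are confirmed in this thread: 8-byte doubles, 4-byte integer indices, compressed sparse column (CSC) storage for A, GPU_TRANSPOSE_MAT controlling whether a transposed copy of A is also kept, and a hypothetical number of dense work vectors of length n + m (`n_work_vectors` is a made-up parameter, not an SCS setting).

```python
# Rough memory estimate from the sizes SCS printed in the log.
# All storage assumptions below are illustrative; the real allocation
# pattern inside SCS may differ.

FLOAT_BYTES = 8  # assumed double precision
INT_BYTES = 4    # assumed 32-bit indices


def svec_length(k):
    """Entries in the lower triangle of a k x k symmetric matrix."""
    return k * (k + 1) // 2


def csc_bytes(nnz, n_cols):
    """Bytes for one CSC matrix: values, row indices, column pointers."""
    return nnz * FLOAT_BYTES + nnz * INT_BYTES + (n_cols + 1) * INT_BYTES


def estimate_bytes(nnz, n, m, keep_transpose=True, n_work_vectors=10):
    """Bytes for A, an optional transposed copy, and some dense work vectors.

    n_work_vectors is a hypothetical count chosen for illustration.
    """
    total = csc_bytes(nnz, n)
    if keep_transpose:
        # The transpose has the same nonzeros but m columns instead of n.
        total += csc_bytes(nnz, m)
    total += n_work_vectors * (n + m) * FLOAT_BYTES
    return total


# Values taken from the log: "nnz in A", "Variables n", "constraints m".
nnz, n, m = 99997194, 50005000, 99987195

# The variable count matches the lower triangle of a 10000 x 10000 matrix.
print("n matches svec(10000):", svec_length(10000) == n)

for keep in (True, False):
    gib = estimate_bytes(nnz, n, m, keep_transpose=keep) / 2**30
    print(f"keep_transpose={keep}: about {gib:.1f} GiB")
```

Comparing the printed figures with the total memory that nvidia-smi reports for the card gives a first indication of whether the n = 10000 problem can fit at all, and how much dropping the transposed copy might save under these assumptions.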

bodono, Mar 26 '18 11:03

Hi @bodono, I was not able to figure out the exact issue, so I have stopped trying to scale the problem up for now. When I start looking at it again in the future, I'll update this thread if I find something helpful.

prateeky2806, Apr 05 '18 07:04

OK, did you try the same problem with the indirect solver on CPU? If that doesn't segfault then it's likely a GPU memory issue.
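A minimal sketch of that comparison from CVXPY, assuming the `use_indirect` and `gpu` keyword options of the SCS 2.x Python interface are passed through by `problem.solve`. The helper names `scs_options` and `solve_with_scs` are hypothetical, and option names may differ in other CVXPY or SCS versions.

```python
def scs_options(use_gpu):
    """Solver options for the indirect (CG-based) SCS linear-system solver.

    The option names are assumed from the SCS 2.x Python interface.
    """
    return {"use_indirect": True, "gpu": use_gpu}


def solve_with_scs(problem, use_gpu):
    """Solve an existing cvxpy.Problem with SCS on the CPU or the GPU."""
    # Imported here so the option helper can be used without CVXPY installed.
    import cvxpy as cp

    return problem.solve(solver=cp.SCS, verbose=True, **scs_options(use_gpu))
```

Calling `solve_with_scs(prob, use_gpu=False)` and then `solve_with_scs(prob, use_gpu=True)` on the same problem object (here a placeholder `prob`) keeps everything except the device identical, so a crash that appears only in the GPU run would be consistent with a GPU memory problem.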

bodono, Apr 07 '18 13:04