train-CRF-RNN: Error while running python solve.py 2>&1 | tee train.log. Check failed: error == cudaSuccess

Open dongfeng951 opened this issue 7 years ago • 4 comments

I ran the code with the number of iterations set to 200. The training process seems OK, but then an error occurs. I ran it on a GTX 1080 and on a GTX870, and the error is the same. Has anyone had the same problem?

I0610 17:14:30.009481 3575 solver.cpp:242] Iteration 100, loss = 1.15994e+06
I0610 17:14:30.009572 3575 solver.cpp:258] Train net output #0: loss-ft = 1.15994e+06 (* 1 = 1.15994e+06 loss)
I0610 17:14:30.009583 3575 solver.cpp:571] Iteration 100, lr = 1e-13
I0610 17:14:58.422884 3575 solver.cpp:242] Iteration 150, loss = 273375
I0610 17:14:58.422955 3575 solver.cpp:258] Train net output #0: loss-ft = 273375 (* 1 = 273375 loss)
I0610 17:14:58.422963 3575 solver.cpp:571] Iteration 150, lr = 1e-13
I0610 17:15:22.307466 3575 solver.cpp:449] Snapshotting to binary proto file models/train_iter_200.caffemodel
I0610 17:15:24.137926 3575 solver.cpp:734] Snapshotting solver state to binary proto filemodels/train_iter_200.solverstate
F0610 17:15:24.994758 3575 syncedmem.hpp:30] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
*** Check failure stack trace: ***
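For anyone triaging a similar train.log, here is a minimal sketch that pulls the per-iteration loss values and any fatal (F-prefixed) lines out of the log, so it is easier to see where the failure falls relative to the snapshot. The names `summarize`, `LINE_RE`, `LOSS_RE` and `SAMPLE` are hypothetical helpers, and the line layout (severity letter, MMDD date, time, thread id, file:line) is only assumed from the lines shown above. It does not diagnose or fix the CUDA error itself.

```python
import re

# Assumed line layout, taken from the log above:
#   <severity><MMDD> <HH:MM:SS.micros> <thread id> <file>:<line>] <message>
LINE_RE = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([\w.]+):(\d+)\] (.*)$"
)
# Matches "Iteration 100, loss = 1.15994e+06" but not "Iteration 100, lr = 1e-13".
LOSS_RE = re.compile(r"Iteration (\d+), loss = ([0-9.eE+-]+)")


def summarize(log_text):
    """Return (losses, fatal) parsed from the text of a training log.

    losses: list of (iteration, loss) tuples
    fatal:  list of (source_file, line_number, message) tuples for F lines
    """
    losses = []
    fatal = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # e.g. the "*** Check failure stack trace: ***" line
        severity, _date, _time, _tid, source, lineno, message = match.groups()
        if severity == "F":
            fatal.append((source, int(lineno), message))
        loss = LOSS_RE.search(message)
        if loss:
            losses.append((int(loss.group(1)), float(loss.group(2))))
    return losses, fatal


# A few lines copied from the log in this issue.
SAMPLE = """\
I0610 17:14:30.009481 3575 solver.cpp:242] Iteration 100, loss = 1.15994e+06
I0610 17:14:30.009583 3575 solver.cpp:571] Iteration 100, lr = 1e-13
I0610 17:14:58.422884 3575 solver.cpp:242] Iteration 150, loss = 273375
F0610 17:15:24.994758 3575 syncedmem.hpp:30] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
*** Check failure stack trace: ***
"""

if __name__ == "__main__":
    losses, fatal = summarize(SAMPLE)
    for iteration, value in losses:
        print("iteration", iteration, "loss", value)
    for source, lineno, message in fatal:
        print("FATAL at", source, lineno, "->", message)
```

To run it on a real log, read the file first, for example `summarize(open("train.log").read())`.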

Jun 10 '17 09:06 dongfeng951

Did you find the solution to this? I have a similar problem after my first snapshot is saved.

Jun 24 '17 20:06 prakarshupmanyu

No. I just gave up.

Jul 13 '17 09:07 dongfeng951

@dongfeng951 @prakarshupmanyu did you fix it? I have the same error.

Sep 29 '17 03:09 ThienAnh

@ThienAnh no man. I wasn't able to. I think I made my own program after trying this.

Sep 29 '17 05:09 prakarshupmanyu
