train-CRF-RNN
Error while running python solve.py 2>&1 | tee train.log: Check failed: error == cudaSuccess
I ran the code with the iteration count set to 200. Training seems to proceed normally, but then the error below occurs. I tried it on a GTX 1080 and on a GTX870, and the error is the same. Has anyone had the same problem?
I0610 17:14:30.009481 3575 solver.cpp:242] Iteration 100, loss = 1.15994e+06
I0610 17:14:30.009572 3575 solver.cpp:258] Train net output #0: loss-ft = 1.15994e+06 (* 1 = 1.15994e+06 loss)
I0610 17:14:30.009583 3575 solver.cpp:571] Iteration 100, lr = 1e-13
I0610 17:14:58.422884 3575 solver.cpp:242] Iteration 150, loss = 273375
I0610 17:14:58.422955 3575 solver.cpp:258] Train net output #0: loss-ft = 273375 (* 1 = 273375 loss)
I0610 17:14:58.422963 3575 solver.cpp:571] Iteration 150, lr = 1e-13
I0610 17:15:22.307466 3575 solver.cpp:449] Snapshotting to binary proto file models/train_iter_200.caffemodel
I0610 17:15:24.137926 3575 solver.cpp:734] Snapshotting solver state to binary proto filemodels/train_iter_200.solverstate
F0610 17:15:24.994758 3575 syncedmem.hpp:30] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
*** Check failure stack trace: ***
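For anyone comparing their own train.log against the one above, here is a minimal sketch that pulls the per-iteration loss values and any fatal (F-prefixed) entries out of a glog-style log. It is not part of train-CRF-RNN: the names `split_entries` and `summarize` are made up for illustration, and the regular expression only assumes the usual glog prefix (severity letter, MMDD, time, thread id, file:line) seen in the log above. It only helps locate where the run died and does not diagnose the CUDA error itself.

```python
import re

# glog prefix, e.g. "F0610 17:15:24.994758 3575 syncedmem.hpp:30] "
# (severity letter, MMDD, time, thread id, source file:line)
GLOG_ENTRY = re.compile(
    r"(?<![A-Za-z0-9])(?P<sev>[IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+"
    r"\d+ (?P<src>[\w.]+:\d+)\] "
)
# Solver message such as "Iteration 150, loss = 273375"
LOSS = re.compile(r"Iteration (\d+), loss = ([0-9.eE+-]+)")


def split_entries(text):
    """Split raw log text into (severity, source, message) tuples.

    Works whether the entries are one per line or run together on a
    single line, because it splits on the glog prefix, not on newlines.
    """
    matches = list(GLOG_ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((m.group("sev"), m.group("src"), text[m.end():end].strip()))
    return entries


def summarize(text):
    """Return ([(iteration, loss), ...], [(source, fatal message), ...])."""
    entries = split_entries(text)
    losses = []
    for _sev, _src, msg in entries:
        m = LOSS.match(msg)
        if m:
            losses.append((int(m.group(1)), float(m.group(2))))
    fatal = [(src, msg) for sev, src, msg in entries if sev == "F"]
    return losses, fatal
```

Calling `summarize(open("train.log").read())` returns the loss history and the fatal entries, which makes it easy to see whether the crash always lands right after the solver-state snapshot, as it does in the log above.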
Did you find the solution to this? I have a similar problem after my first snapshot is saved.
No. I just gave up.
@dongfeng951 @prakarshupmanyu did you fix it? I have the same error.
@ThienAnh no man. I wasn't able to. I think I made my own program after trying this.