Im2Text
CUDA runtime error
Hi, nice repo. I am running the training example with the given dataset and I am getting a CUDA runtime error. I am attaching the log file.
Hmm, I suspect there's something wrong with your cutorch. Can you try th -lcutorch -e "cutorch.test()" and see the results?
"Completed 76020 asserts in 180 tests with 0 failures and 0 errors". I have tried it on two machines and both had the error. I was able to test the model but not train it.
@arunpatala Unfortunately, I have also encountered the same "device-side assert triggered" problem, on both a Titan X Pascal and a Maxwell card. I have checked cutorch but didn't find any problems. Have you solved this problem?
This problem may be caused by a recent update of cutorch: https://github.com/torch/cutorch/issues/708. However, after adding CUDA_LAUNCH_BLOCKING=1, it fails in the same way as before.
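For anyone who wants to repeat that check, here is a minimal sketch of how the variable can be set for a single run. CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so a device-side assert is reported at the call that triggered it. TRAIN_CMD and train.log are hypothetical placeholders, not names from this repo; substitute your own training command.

```shell
#!/bin/sh
# Make CUDA kernel launches synchronous for everything started from this shell.
export CUDA_LAUNCH_BLOCKING=1

# TRAIN_CMD is a hypothetical placeholder: set it to the command you use to
# start training. The default only echoes a reminder so the script runs as-is.
TRAIN_CMD="${TRAIN_CMD:-echo replace-with-your-training-command}"

# Run the command and keep a copy of its output (stdout and stderr) in train.log.
$TRAIN_CMD 2>&1 | tee train.log
```

If the error still appears with the variable set, train.log should at least show the assert next to the operation that raised it.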
Can you try again? I found a bug that may have caused that problem. @SuperWu090
@da03 Thanks very much! I have tested the program and this problem has been solved. However, after the recent update to Batch.lua in OpenNMT (it seems to be commit 1b7632a7799be84da0ef8e8407002484e38c0fe1), there is a new problem: "~/torch/install/bin/luajit: ~/torch/install/share/lua/5.1/onmt/data/Batch.lua:78: attempt to index a nil value". Using the earlier version of OpenNMT (47431c773c2598384ea6f8c2200c25161f2eef12) may avoid it.
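A rough sketch of pinning OpenNMT to that earlier commit, assuming it was cloned with git. ONMT_DIR is a hypothetical variable for the location of the clone and defaults to ./OpenNMT here; the commit hash is the one quoted above.

```shell
#!/bin/sh
# ONMT_DIR is a hypothetical placeholder for where your OpenNMT clone lives.
ONMT_DIR="${ONMT_DIR:-OpenNMT}"
# Earlier OpenNMT commit mentioned above as not showing the Batch.lua:78 error.
GOOD_COMMIT=47431c773c2598384ea6f8c2200c25161f2eef12

if [ -d "$ONMT_DIR/.git" ]; then
  # Move the working tree to the earlier commit (detached HEAD).
  git -C "$ONMT_DIR" checkout "$GOOD_COMMIT"
  echo "Checked out $GOOD_COMMIT in $ONMT_DIR"
else
  echo "No git checkout found at $ONMT_DIR; set ONMT_DIR to your clone" >&2
fi
```

The error path points at the installed copy under ~/torch/install, so after the checkout OpenNMT would need to be reinstalled the same way it was installed originally for the change to take effect.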