char-rnn

memory problems and frequent crashes

Open: Nyrt opened this issue 9 years ago • 1 comment

When I try to train any network larger than the default size on the GPU, I quickly run into this error:

cuda runtime error (77) : an illegal memory access was encountered at /tmp/luarocks_cutorch-scm-1-6753/cutorch/lib/THC/generic/THCStorage.c:147

I know I'm not running out of memory, so I assume this is some kind of segmentation fault.

Training seems to work fine on the CPU, which suggests that the problem is with cutorch (as the error message also indicates). But I'm doing all of this on my personal computer, and CPU training is an order of magnitude slower, so I'd like to get GPU training working again.

Nyrt, Dec 30 '15 07:12

Interestingly, after reinstalling everything, this now only occurs when training on my second GPU (-gpuid 1) but not on GPU 0, which is frustrating because GPU 1 is a little bit faster. Still, it's better than not working at all.

Nyrt, Feb 05 '16 05:02