word-rnn

Training crashes when the dataset is too large

Open prazek opened this issue 7 years ago • 0 comments

Hi, I get a lot of errors like:

torch/extra/cutorch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [37,0,0], thread: [0,0,0] Assertion srcIndex < srcSelectDimSize failed.

stack traceback:
    [C]: in function 'addmm'
    /home/prazek/torch/install/share/lua/5.1/nn/Linear.lua:66: in function 'func'
    ...e/prazek/torch/install/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
    ...e/prazek/torch/install/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
    train.lua:299: in function 'opfunc'
    /home/prazek/torch/install/share/lua/5.1/optim/rmsprop.lua:35: in function 'optimizer'
    train.lua:358: in main chunk
    [C]: in function 'dofile'
    ...azek/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405db0

It fails like this when I train with GloVe. Without GloVe, on the other hand, the whole machine starts swapping in a weird way (probably something on the GPU side). Any guesses?
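For context, the assertion in the error above (srcIndex < srcSelectDimSize) seems to say that some index passed to an index-select is not smaller than the size of the dimension being selected from. One possible way to narrow this down is to scan a batch of word indices for out-of-range values before it reaches the GPU. The following is only a minimal sketch in Python, not code from word-rnn: the function name `find_out_of_range`, the `vocab_size` parameter, and the sample batch are hypothetical, and it assumes 1-based indices as in Lua Torch.

```python
def find_out_of_range(indices, vocab_size, index_base=1):
    """Return (position, index) pairs that fall outside the valid range.

    Assumption: valid indices run from index_base to
    index_base + vocab_size - 1 (1-based by default, as in Lua Torch).
    """
    bad = []
    for pos, idx in enumerate(indices):
        zero_based = idx - index_base  # shift to a 0-based offset
        if zero_based < 0 or zero_based >= vocab_size:
            bad.append((pos, idx))
    return bad


# Hypothetical batch for a 5-word vocabulary: 6 is too large, 0 is too small.
batch = [1, 5, 6, 0, 3]
print(find_out_of_range(batch, vocab_size=5))
```

If a check like this reports any entries, the vocabulary size used to build the lookup table and the indices produced by the data loader may disagree, though that is a guess rather than a confirmed cause.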

prazek, Apr 25 '17 21:04