word-rnn

Error when sampling

Open allthetime opened this issue 8 years ago • 4 comments

When running sample.lua against .t7 files, I frequently (but not always, depending on the temperature and seed text) run into this error:

/home/me/torch/install/bin/luajit: bad argument #2 to '?' (out of bounds at /home/me/torch/pkg/torch/lib/TH/generic/THStorage.c:178)
stack traceback:
    [C]: at 0x7f38999b08e0
    [C]: in function 'multinomial'
    sample.lua:170: in main chunk
    [C]: in function 'dofile'
    ...time/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

It usually comes up after some text has been predicted, and it tends to show up sooner (less text predicted) when the temperature is lower. A higher temperature lets more text through before the error occurs.

It seems a similar issue exists (or existed) in char-rnn: https://github.com/karpathy/char-rnn/issues/28

From that thread: "The error means your data are naned. Two possible causes include the weights becoming naned during training, or the cv snapshot file being corrupted somehow."
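As a rough illustration of how NaN values could reach the sampler, here is a minimal sketch in Python (not the Lua code from sample.lua). It assumes the sampler divides log-probabilities by the temperature, exponentiates them, and normalizes before drawing from a multinomial; this thread does not confirm that. Under that assumption, a low temperature can push every exponent to zero, so the normalization divides by zero. The function names `stable_probs` and `sample_index` are hypothetical.

```python
import math
import random


def stable_probs(log_probs, temperature):
    """Convert log-probabilities into a sampling distribution at a given temperature.

    Hypothetical helper: it rejects NaN inputs up front and subtracts the
    maximum before exponentiating, so the largest term is exp(0) = 1 and
    the normalizing sum can never be zero.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if any(math.isnan(x) for x in log_probs):
        raise ValueError("model output contains NaN; the checkpoint may be bad")
    scaled = [x / temperature for x in log_probs]
    largest = max(scaled)
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng=random):
    """Draw one index from a normalized distribution by walking the cumulative sum."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    # Guard against rounding leaving the cumulative sum slightly below 1.
    return len(probs) - 1


if __name__ == "__main__":
    # At this temperature a naive exp() of the scaled values underflows to 0 for both entries.
    probs = stable_probs([-8.0, -9.0], temperature=0.01)
    print(probs, sample_index(probs, random.Random(0)))
```

If the NaN check fires on the raw model output, the problem lies in the trained weights or the snapshot file rather than in the sampling step, which matches the explanation quoted from the char-rnn thread.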

Is there any way I can avoid this situation?

allthetime, Mar 10 '16 04:03

Same error here, did you find a solution?

amitphadke, Feb 14 '17 18:02

Might be related to https://github.com/jcjohnson/torch-rnn/pull/195

I have been experiencing this error as well.

bloons3, Sep 18 '17 03:09

I used torch-rnn and word-rnn with a 5 MB dataset with no problems. I got the same error with an 8.2 MB set.

lowtronik, Sep 18 '17 16:09

Tried word-rnn on CPU with 56 GB of RAM; no luck there either (previously I was on a 6 GB GPU).

lowtronik, Sep 18 '17 16:09