
Error: error: wrote 0 blocks instead of 1

Open · leostratus opened this issue 8 years ago · 1 comment

Hey folks, loving this package. From time to time I'll be running training and get the following error:

/home/ubuntu/torch-cl/install/bin/luajit: /home/ubuntu/torch-cl/install/share/lua/5.1/torch/File.lua:134: write error: wrote 0 blocks instead of 1 at /home/ubuntu/torch-cl/pkg/torch/lib/TH/THDiskFile.c:323
stack traceback:
    [C]: in function 'writeInt'
    /home/ubuntu/torch-cl/install/share/lua/5.1/torch/File.lua:134: in function 'writeObject'
    /home/ubuntu/torch-cl/install/share/lua/5.1/torch/File.lua:226: in function 'writeObject'
    /home/ubuntu/torch-cl/install/share/lua/5.1/torch/File.lua:226: in function 'writeObject'
    /home/ubuntu/torch-cl/install/share/lua/5.1/torch/File.lua:379: in function 'save'
    train.lua:242: in main chunk
    [C]: in function 'dofile'
    ...u/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

The problem does not reproduce reliably: the epoch it fails on shows no pattern. One time it failed at epoch 5, another time at epoch 33, and so on.

I have completed training sessions on other data with no errors, so I know torch-rnn is installed correctly and working.

The dataset is about 7 MB of plaintext, all of which has been successfully run through preprocess.py.

The hyperparameters/invocation I'm using:

th train.lua -input_h5 data/data.h5 -input_json data/data.json -model_type lstm -num_layers 2 -rnn_size 128 -seq_length 80 -dropout 0.5 -learning_rate 3e-3 -lr_decay_factor 0.8

I've been deliberately changing parameters to get weird results, so the invocation above might raise some eyebrows about what kind of model it would generate. ;)

leostratus, Jun 20 '16 22:06

I get this error as well:

/home/_/torch/install/bin/luajit: /home/_/torch/install/share/lua/5.1/torch/File.lua:210: write error: wrote 17069883 blocks instead of 28443465 at /tmp/luarocks_torch-scm-1-1419/torch7/lib/TH/THDiskFile.c:340
stack traceback:
    [C]: in function 'write'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:210: in function </home/_/torch/install/share/lua/5.1/torch/File.lua:107>
    [C]: in function 'write'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:210: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/nn/Module.lua:154: in function 'write'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:210: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/nn/Module.lua:154: in function 'write'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:210: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
    /home/_/torch/install/share/lua/5.1/torch/File.lua:388: in function 'save'
    train.lua:242: in main chunk
    [C]: in function 'dofile'
    ...than/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

Nyrt, Jan 12 '17 19:01
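
Both traces fail inside the save call at train.lua:242, and in the second one the write stops partway through (17069883 of 28443465 blocks). A short write like this is often a sign that the disk or a quota filled up while the file was being saved, though nothing in this thread confirms that is the cause here. As a quick check under that assumption, the sketch below tests free space in a directory before a run. The helper name has_free_space and the 1 GiB threshold are hypothetical, and the directory to check should be wherever your checkpoints are written.

```shell
# Hypothetical helper: succeeds if DIR has at least MIN_KB kilobytes available.
has_free_space() {
    dir="$1"
    min_kb="$2"
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # on the second line, the fourth field is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$min_kb" ]
}

# Example: check the current directory for 1 GiB (1048576 KB) of free space.
# Replace "." with your checkpoint directory (an assumption about your setup).
if has_free_space . 1048576; then
    echo "at least 1 GiB free"
else
    echo "low disk space"
fi
```

If the check fails, or free space shrinks steadily over a run, freeing space or removing old saved files before training again would be a reasonable first thing to try.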