Seq2Seq-Vis
Training error with custom OpenNMT
I tried to start training on my data with the downloaded OpenNMT, using the following flags:
python train.py -gpuid 2 -save_model training/models -data training/data/data -layers 6 \
-rnn_size 512 -word_vec_size 512 -epochs 30 -max_grad_norm 0 -optim adam \
-encoder_type transformer -decoder_type transformer -position_encoding \
-dropout 0.1 -param_init 0 -warmup_steps 12000 -learning_rate 0.2 \
-decay_method noam -label_smoothing 0.1 -adam_beta2 0.98 -batch_size 80 \
-start_decay_at 31
It throws the following error:
* number of parameters: 91402259
encoder: 34166272
decoder: 57235987
Loading train dataset from training/data/data.train.1.pt, number of examples: 1872377
/home/sai/anaconda3/envs/s2sv/lib/python3.6/site-packages/torch/nn/modules/module.py:357: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
result = self.forward(*input, **kwargs)
Traceback (most recent call last):
File "train.py", line 405, in <module>
main()
File "train.py", line 401, in main
train_model(model, fields, optim, data_type, model_opt)
File "train.py", line 219, in train_model
train_stats = trainer.train(train_iter, epoch, report_func)
File "/home/sai/grammar/OpenNMT-py/onmt/Trainer.py", line 214, in train
report_stats, normalization)
File "/home/sai/grammar/OpenNMT-py/onmt/Trainer.py", line 179, in gradient_accumulation
self.model(src, tgt, src_lengths, dec_state)
File "/home/sai/anaconda3/envs/s2sv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/home/sai/grammar/OpenNMT-py/onmt/Models.py", line 561, in forward
context_lengths=lengths)
ValueError: not enough values to unpack (expected 4, got 3)
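This kind of ValueError usually means the model's forward call in Models.py returns a 3-tuple while the calling code was written against a newer (or older) API that unpacks 4 values, i.e. the checked-out OpenNMT-py and the Trainer code are out of sync. A minimal sketch of the failure mode (the function name is hypothetical, not the actual OpenNMT-py API):

```python
def forward_old_api():
    # Older-style forward: returns 3 values (outputs, attentions, decoder state)
    return "outputs", "attns", "dec_state"

# Caller written against a newer API that expects a 4th return value
try:
    outputs, attns, dec_state, memory_bank = forward_old_api()
except ValueError as e:
    print(e)  # "not enough values to unpack (expected 4, got 3)"
```

So the fix is not in your flags but in matching the code version, which is what the reply below suggests.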
Any idea how to fix this? I am using conda 4.5.4. The setup script didn't install PyTorch into the s2sv environment, so I just ran
conda install pytorch=0.3.1 -c soumith
after source activate .
As a test for training the model, can you use this commit on the main branch: https://github.com/OpenNMT/OpenNMT-py/commit/3bd519070ad3b092dc081d2e72d0cd129bd085c7 (or any older one)? The code is being refactored for PyTorch 0.4 right now, and I will update Seq2Seq-Vis once this is done.
For the server, I think we need to add some fixes for the transformer, since it is not quite clear which attention we want to visualize.
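Pinning the repository to that commit could look like the sketch below (assumes git and network access; the requirements-install step is a guess at your setup and may differ):

```shell
# Clone OpenNMT-py and check out the suggested pre-refactor commit
git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py
git checkout 3bd519070ad3b092dc081d2e72d0cd129bd085c7
# Re-install dependencies for this revision if needed (hypothetical step)
# pip install -r requirements.txt
```

Re-running the train.py command from above against this checkout should avoid the API mismatch.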