ARAE
Error when running yelp/train.py
I followed README.md and ran
python train.py --data_path ./data
But then I got the following errors:
{'dropout': 0.0, 'lr_ae': 1, 'load_vocab': '', 'nlayers': 1, 'batch_size': 64, 'beta1': 0.5, 'gan_gp_lambda': 0.1, 'nhidden': 128, 'vocab_size': 30000, 'niters_gan_schedule': '', 'niters_gan_d': 5, 'lr_gan_d': 0.0001, 'grad_lambda': 0.01, 'sample': False, 'arch_classify': '128-128', 'clip': 1, 'hidden_init': False, 'cuda': True, 'log_interval': 200, 'device_id': '0', 'temp': 1, 'seed': 1111, 'maxlen': 25, 'lowercase': True, 'data_path': './data', 'lambda_class': 1, 'lr_classify': 0.0001, 'outf': 'yelp_example', 'noise_r': 0.1, 'noise_anneal': 0.9995, 'lr_gan_g': 0.0001, 'niters_gan_g': 1, 'arch_g': '128-128', 'z_size': 32, 'epochs': 25, 'niters_ae': 1, 'arch_d': '128-128', 'emsize': 128, 'niters_gan_ae': 1}
Original vocab 9599; Pruned to 9603
Number of sentences dropped from ./data/valid1.txt: 0 out of 38205 total
Number of sentences dropped from ./data/valid2.txt: 0 out of 25278 total
Number of sentences dropped from ./data/train1.txt: 0 out of 267314 total
Number of sentences dropped from ./data/train2.txt: 0 out of 176787 total
Vocabulary Size: 9603
382 batches
252 batches
4176 batches
2762 batches
Loaded data!
Seq2Seq2Decoder(
(embedding): Embedding(9603, 128)
(embedding_decoder1): Embedding(9603, 128)
(embedding_decoder2): Embedding(9603, 128)
(encoder): LSTM(128, 128, batch_first=True)
(decoder1): LSTM(256, 128, batch_first=True)
(decoder2): LSTM(256, 128, batch_first=True)
(linear): Linear(in_features=128, out_features=9603, bias=True)
)
MLP_G(
(layer1): Linear(in_features=32, out_features=128, bias=True)
(bn1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation1): ReLU()
(layer2): Linear(in_features=128, out_features=128, bias=True)
(bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation2): ReLU()
(layer7): Linear(in_features=128, out_features=128, bias=True)
)
MLP_D(
(layer1): Linear(in_features=128, out_features=128, bias=True)
(activation1): LeakyReLU(negative_slope=0.2)
(layer2): Linear(in_features=128, out_features=128, bias=True)
(bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation2): LeakyReLU(negative_slope=0.2)
(layer6): Linear(in_features=128, out_features=1, bias=True)
)
MLP_Classify(
(layer1): Linear(in_features=128, out_features=128, bias=True)
(activation1): ReLU()
(layer2): Linear(in_features=128, out_features=128, bias=True)
(bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation2): ReLU()
(layer6): Linear(in_features=128, out_features=1, bias=True)
)
Training...
Traceback (most recent call last):
File "train.py", line 574, in <module>
train_ae(1, train1_data[niter], total_loss_ae1, start_time, niter)
File "train.py", line 400, in train_ae
output = autoencoder(whichdecoder, source, lengths, noise=True)
File "/localhome/imd/anaconda2/envs/Pytorch/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/groups/branson/home/imd/Documents/project/ARAE/yelp/models.py", line 143, in forward
hidden = self.encode(indices, lengths, noise)
File "/groups/branson/home/imd/Documents/project/ARAE/yelp/models.py", line 160, in encode
batch_first=True)
File "/localhome/imd/anaconda2/envs/Pytorch/lib/python3.5/site-packages/torch/onnx/__init__.py", line 56, in wrapper
if not might_trace(args):
File "/localhome/imd/anaconda2/envs/Pytorch/lib/python3.5/site-packages/torch/onnx/__init__.py", line 130, in might_trace
first_arg = args[0]
IndexError: tuple index out of range
Hmm, could you try running it with python3?
I've run into the same issue with Python 3.5.2 and torch==0.4.1.
Training...
Traceback (most recent call last):
File "train.py", line 574, in <module>
train_ae(1, train1_data[niter], total_loss_ae1, start_time, niter)
File "train.py", line 400, in train_ae
output = autoencoder(whichdecoder, source, lengths, noise=True)
File "/home/v2john/.pyenv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/v2john/ARAE/yelp/models.py", line 143, in forward
hidden = self.encode(indices, lengths, noise)
File "/home/v2john/ARAE/yelp/models.py", line 160, in encode
batch_first=True)
File "/home/v2john/.pyenv/lib/python3.5/site-packages/torch/onnx/__init__.py", line 67, in wrapper
if not might_trace(args):
File "/home/v2john/.pyenv/lib/python3.5/site-packages/torch/onnx/__init__.py", line 141, in might_trace
first_arg = args[0]
IndexError: tuple index out of range
Python3 clearly isn't the fix. It seems like something about the PyTorch + ONNX interop is broken. Is there a specific version of PyTorch that's needed to run this?
@jiwoongim
You can try using my forked version of the repository to see if it fixes the issue for you. I've verified it to be working for Python 3.5.2 and PyTorch 0.4.1 https://github.com/vineetjohn/arae
I've not identified the actual problem yet, but I've added a workaround that avoids having to deal with ONNX altogether. The pack_padded_sequence method in torch.nn.utils.rnn seems to be buggy.
Guys can you try python 3.6? @jiwoongim @vineetjohn
@jiwoongim You can try using my forked version of the repository. I resolved the issue by making several changes to the original code, and I have verified that it works with Python 3.6.5 and PyTorch 0.4.1: https://github.com/rainyrainyguo/ARAE
@jakezhaojb
This doesn't look like a Python version issue. The named arguments used in this project vs. those accepted by PyTorch 0.4.1 are inconsistent.
You should consider adding the version of PyTorch used for your experiments to the project README.
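One plausible reading of the tracebacks above: the wrapper in torch/onnx/__init__.py reads args[0], so it would raise this IndexError whenever the wrapped function receives keyword arguments only. Whether models.py line 160 really passes everything by keyword is an assumption, since the traceback shows only the trailing `batch_first=True)`. The sketch below is a plain-Python stand-in, not PyTorch code; `onnx_style_wrapper` and `pack` are hypothetical names used only to reproduce the failure mode.

```python
import functools


def might_trace(args):
    # Mirrors the line shown in the traceback: it assumes at least one
    # positional argument was passed.
    first_arg = args[0]
    return False  # simplified: this sketch never traces


def onnx_style_wrapper(fn):
    """Hypothetical stand-in for the wrapper in torch/onnx/__init__.py."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if not might_trace(args):
            return fn(*args, **kwargs)
        raise NotImplementedError("tracing is out of scope for this sketch")
    return wrapper


@onnx_style_wrapper
def pack(input, lengths, batch_first=False):
    # Placeholder for pack_padded_sequence; it just echoes its arguments.
    return input, lengths, batch_first


def call_with_keywords_only():
    # Every argument is a keyword, so args == () inside the wrapper.
    try:
        pack(input=[[1, 2], [3, 0]], lengths=[2, 1], batch_first=True)
    except IndexError as exc:
        return str(exc)
    return None


def call_with_positional_input():
    # Passing the first arguments positionally avoids the empty tuple.
    return pack([[1, 2], [3, 0]], [2, 1], batch_first=True)
```

If this reading is right, passing the embeddings and lengths positionally at the failing call would sidestep the error on 0.4.x without touching ONNX at all.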
@vineetjohn Good point! I used PyTorch 0.3.1. I'm adding this to the README
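Besides the README note, a start-up check could flag a mismatched install early. This is a minimal sketch, not part of the repository: `parse_version`, `version_warning`, and `TESTED_VERSION` are hypothetical names, and the version string is passed in by the caller (for example from `torch.__version__`).

```python
TESTED_VERSION = (0, 3, 1)  # the version reported in this thread


def parse_version(version):
    """Turn '0.4.1' or '0.4.1.post2' into a tuple of leading integers."""
    parts = []
    # Drop any local build suffix such as '+cu117' before splitting.
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def version_warning(installed):
    """Return a warning string if major.minor differs from the tested one."""
    if parse_version(installed)[:2] != TESTED_VERSION[:2]:
        tested = ".".join(str(n) for n in TESTED_VERSION)
        return "This code was run with PyTorch {}; found {}.".format(
            tested, installed)
    return None
```

Calling `version_warning(torch.__version__)` at the top of train.py and printing any non-None result would be one way to use it.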
@rainyrainyguo
I ran your forked version with Python 3.6.5 and PyTorch 0.4.1 (cuDNN 7.1.3, cudatoolkit 8.0) and got the following error:
Training ....
run_oneb.py:256: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
run_oneb.py:259: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
run_oneb.py:263: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
| epoch 1 | 0/ 765 batches | ms/batch 0.61 | loss 0.05 | ppl 1.05 | acc 0.00
Traceback (most recent call last):
File "run_oneb.py", line 102, in
Can you give me some advice?
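The traceback above is cut off, so the failure itself can't be diagnosed from this post. The two warnings, however, name their own replacements: `clip_grad_norm_` and `tensor.item()`. The sketch below shows both on a toy model. It assumes PyTorch 0.4 or later, and the model, data, and variable names are made up for illustration rather than taken from run_oneb.py.

```python
import torch
import torch.nn as nn

# Toy model and data, used only to have a loss and gradients to work with.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(inputs), targets)
loss.backward()

# In-place variant named in the first warning (note the trailing underscore).
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()

# loss is a 0-dim tensor; .item() converts it to a Python float,
# replacing older indexing such as loss.data[0].
loss_value = loss.item()
```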
@dangvanthin Hi, I ran into the same problem. Have you found a solution? Thank you