
Unpredictable behavior of adaptive softmax

Open songyuzhou324 opened this issue 6 years ago • 3 comments

The behavior of the adaptive softmax is very unpredictable. Sometimes I can run the whole code on dataset A on the first try, but then get an error when training on dataset B, which has the same format and schema. If I then switch back to dataset A, the code fails as well. Here is the error message:

Traceback (most recent call last):
  File "main.py", line 244, in <module>
    train()
  File "main.py", line 208, in train
    loss.backward()
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: invalid argument 3: Index tensor must have same dimensions as input tensor at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generic/THCTensorScatterGather.cu:199
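For context, this RuntimeError comes from a gather/scatter call whose index tensor has fewer dimensions than its input. A minimal sketch of that PyTorch constraint (an illustration only, not the actual code in splitcross.py):

    import torch

    logits = torch.randn(4, 10)            # (batch, vocab) scores
    targets = torch.randint(0, 10, (4,))   # 1-D target indices

    # Raises "Index tensor must have same dimensions as input tensor",
    # because gather requires index.dim() == input.dim():
    # picked = logits.gather(1, targets)

    # Works: unsqueeze the index tensor to match the input's dimensionality.
    picked = logits.gather(1, targets.unsqueeze(1))  # shape (4, 1)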

This issue has blocked me for a long time. Please review it, thanks!

songyuzhou324 · May 02 '18 03:05

PS: after hitting the error the first time, I'm unable to train on any dataset except ones with a vocabulary under 75,000, which fall back to the regular softmax. Thus, something in splitcross.py should be fixed.
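For reference, the 75,000 cutoff matches how main.py selects the split points for SplitCrossEntropyLoss; below a sketch paraphrased from the repo (the hypothetical ntokens and emsize values are stand-ins, and the exact thresholds may differ by revision):

    from splitcross import SplitCrossEntropyLoss  # module from this repo

    ntokens = 80000   # hypothetical vocabulary size
    emsize = 400      # embedding size (the repo's default)

    # Small vocabularies get no splits, i.e. a single full softmax over the
    # whole vocabulary, which is why only < 75k-vocab datasets still run.
    splits = []
    if ntokens > 500000:
        splits = [4200, 35000, 180000]   # e.g. One Billion Word
    elif ntokens > 75000:
        splits = [2800, 20000, 76000]    # e.g. WikiText-103

    criterion = SplitCrossEntropyLoss(emsize, splits=splits, verbose=False)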

songyuzhou324 · May 02 '18 04:05

Can you try the commit https://github.com/salesforce/awd-lstm-lm/tree/bf0742cab41d8bf4cd817acfe7e5e0cbff4131ba ? If that works, I can help you bring over the improvements from that commit for low-vocabulary datasets.
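(For anyone trying this: the linked revision can be checked out in a local clone with git checkout bf0742cab41d8bf4cd817acfe7e5e0cbff4131ba before rerunning main.py.)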

keskarnitish · May 08 '18 20:05

I tried this and it seems to work. I'm trying to train on my own dataset (cantab-tedlium) with the adaptive softmax, and it crashes often.

kushalarora · May 25 '18 15:05