loss_dropper
If my loss is one-dimensional, do I need to reshape it to two dimensions? Use `loss = loss.view(-1, batch_size)`.
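A minimal sketch of that reshaping step, following the pattern in the repository's README: the loss is computed with `reduction='none'`, reshaped so each column is one example, and averaged per sequence before the dropper masks out high-loss examples. The `dropc` argument and `LossDropper` usage are taken from the README; the tensor shapes and layout here are illustrative assumptions (note that `.view(-1, batch_size)` presumes the flattened losses are in time-major order):

```python
import torch
import torch.nn as nn
from loss_dropper import LossDropper

seq_len, batch_size, vocab_size = 20, 8, 100

criterion = nn.NLLLoss(reduction='none')  # keep per-token losses
dropper = LossDropper(dropc=0.4)          # drop the highest-loss 40% of examples

# Time-major layout (seq_len, batch_size, vocab) flattened to 2-D,
# so that .view(-1, batch_size) below recovers (seq_len, batch_size)
log_probs = torch.randn(seq_len * batch_size, vocab_size).log_softmax(dim=-1)
targets = torch.randint(vocab_size, (seq_len * batch_size,))

loss = criterion(log_probs, targets)  # shape: (seq_len * batch_size,)
loss = loss.view(-1, batch_size)      # -> (seq_len, batch_size)
loss = loss.mean(dim=0)               # one loss per sequence in the batch
mask = dropper(loss)                  # 0 where an example should be dropped
loss = (loss * mask).mean()           # aggregate only the kept examples
```

Note that the dropper keeps running statistics of observed losses, so it may not drop anything during the first batches of training.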
I'm really interested in your great work. Just curious: is it possible to combine BART with loss truncation? The vanilla LSTM with attention is somewhat out of date.
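Since loss truncation operates on per-example loss values rather than on the model itself, it can in principle wrap any model that exposes per-token losses, BART included. A hedged sketch, assuming the Hugging Face `BartForConditionalGeneration` API and computing the per-token cross-entropy manually with `reduction='none'` instead of using the model's built-in mean loss (the model checkpoint and input strings are illustrative, and padding handling is omitted for brevity):

```python
import torch.nn.functional as F
from transformers import BartForConditionalGeneration, BartTokenizer
from loss_dropper import LossDropper

tokenizer = BartTokenizer.from_pretrained('facebook/bart-base')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-base')
dropper = LossDropper(dropc=0.4)

inputs = tokenizer(['an example source sentence'], return_tensors='pt')
labels = tokenizer(['an example target sentence'], return_tensors='pt').input_ids

outputs = model(**inputs, labels=labels)
logits = outputs.logits  # (batch, seq_len, vocab)

# Per-token cross-entropy instead of the model's built-in mean loss
loss = F.cross_entropy(
    logits.transpose(1, 2),  # (batch, vocab, seq_len)
    labels,
    reduction='none',
)                            # (batch, seq_len)
loss = loss.mean(dim=1)      # one loss per sequence
mask = dropper(loss)         # zero out the highest-loss examples
loss = (loss * mask).mean()
loss.backward()
```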
I saw your paper at ACL and want to test it out in my MT/summarization training (code: [https://github.com/huggingface/transformers/blob/master/examples/seq2seq/finetune.py]). What should I pass as `weight` to `nn.NLLLoss`, and what is the recommended...
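For context on that question: `weight` in `nn.NLLLoss` is an optional 1-D tensor of per-class (per-vocabulary-token) rescaling factors, and passing `None` leaves all classes equally weighted. For loss truncation, the argument that matters most is arguably `reduction='none'`, so per-token losses survive long enough to be aggregated per sequence. A small sketch of constructing the criterion under those assumptions (the vocabulary size and padding id are illustrative placeholders, not values recommended by the authors):

```python
import torch.nn as nn

vocab_size = 50265  # illustrative: BART's vocabulary size
pad_token_id = 1    # illustrative: a padding token id to exclude

criterion = nn.NLLLoss(
    weight=None,                # or a length-vocab_size tensor of class weights
    ignore_index=pad_token_id,  # skip padding positions in the loss
    reduction='none',           # keep per-token losses for loss truncation
)
```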