loss_dropper
Is there any way to apply this work to a pretrained model (e.g. BART, T5)?
I'm really interested in your great work. Just curious: is it possible to combine BART with loss truncation? The vanilla LSTM with attention is kind of out of date.
Hi @ElderWanng, you can take a pretrained model and continue training it with loss truncation. We find it works quite well.
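For anyone wondering what that looks like in practice, here is a minimal sketch of the truncation step in plain Python. It is an illustration under assumptions, not this repository's implementation: the helper names `truncation_mask` and `truncated_mean_loss` are hypothetical, and the mask is computed per batch for simplicity. The per-example losses would come from the pretrained model (for instance, BART's sequence-level negative log-likelihood computed without reduction), and the masked mean would replace the usual loss during continued training.

```python
def truncation_mask(losses, drop_fraction):
    """Return a 0/1 mask that zeroes the highest-loss examples in a batch.

    Hypothetical helper: a per-batch simplification of loss truncation.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    n_drop = int(len(losses) * drop_fraction)  # number of examples to discard
    # Indices ordered from highest loss to lowest.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [0.0 if i in dropped else 1.0 for i in range(len(losses))]


def truncated_mean_loss(losses, drop_fraction):
    """Average the loss over the examples that survive truncation."""
    mask = truncation_mask(losses, drop_fraction)
    kept = sum(mask)  # always > 0 because drop_fraction < 1
    return sum(l * m for l, m in zip(losses, mask)) / kept


# Hypothetical per-example losses from one batch of a pretrained model.
batch_losses = [0.5, 1.0, 4.0, 1.5]
print(truncation_mask(batch_losses, 0.25))
print(truncated_mean_loss(batch_losses, 0.25))
```

Here the outlier with loss 4.0 is masked out and the remaining three examples are averaged. Dividing by the number of kept examples rather than the full batch size is a choice made for this sketch; in a real training loop the same masking would be applied to the model's loss tensor before calling backward.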