
Ignore padding tokens when using Translation Task + `padding='max_length'`

Open SeanNaren opened this issue 3 years ago • 6 comments

🐛 Bug

When using the Translation Task, we need to ensure that padding tokens are skipped in the loss calculation. Currently we do not replace the padding token ids with -100, which could be detrimental to training.
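To illustrate the failure mode, here's a minimal sketch (the t5-small checkpoint is just an example) showing that with `padding='max_length'` the label ids keep the pad token id, so an unmasked cross-entropy scores those positions like real tokens:

```python
from transformers import AutoTokenizer

# Any seq2seq tokenizer shows the same behaviour; t5-small is an example.
tokenizer = AutoTokenizer.from_pretrained("t5-small")

labels = tokenizer(
    "Some target sentence", padding="max_length", max_length=16
).input_ids
print(labels)
# The tail of the list is tokenizer.pad_token_id (0 for T5) repeated.
# Unless those positions are replaced with -100, they count in the loss.
```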

SeanNaren avatar Jun 23 '22 12:06 SeanNaren

Hi @SeanNaren, I'm looking to contribute to ML/software projects and have been using pytorch-lightning myself (read: I'm a fan!). Could you point me to where to get started on this issue? I'd like to scope whether I can devote some of my time to fixing it.

spranjal25 avatar Aug 15 '22 16:08 spranjal25

@SeanNaren would you have some points on how/where to start? :rabbit:

Borda avatar Sep 14 '22 22:09 Borda

Hi @SeanNaren, @Borda, I think this is what needs to be modified.

I referred to this example. There, we use TranslationTransformer for training, and it inherits from Seq2SeqTransformer. If we look at this line, we see that the output is loss, logits; however, the loss here is calculated with the padding tokens taken into account.

I found an answer describing how to solve it from the Hugging Face community here.

So, I guess the change to be made, at the same line (i.e. here), is:

  1. Obtain the loss and logits from the common step.
  2. Initialize the cross-entropy loss with ignore_index = -100.
  3. Replace the target token indices that equal the padding token id (0 here) with -100.
  4. Calculate the final loss, then proceed with the steps as usual (a sketch follows this list).
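
A minimal sketch of those steps, assuming (batch, seq_len, vocab) logits; the function name and the default pad_token_id=0 are my own illustration, not the library's API:

```python
import torch
import torch.nn.functional as F

def translation_loss(
    logits: torch.Tensor, labels: torch.Tensor, pad_token_id: int = 0
) -> torch.Tensor:
    """Cross-entropy over seq2seq logits that skips padding positions.

    pad_token_id=0 matches T5-style tokenizers; pass the model's actual id.
    """
    # Step 3: map padding positions in the targets to -100 ...
    labels = labels.masked_fill(labels == pad_token_id, -100)
    # Steps 2 & 4: ... so cross-entropy with ignore_index=-100 skips them.
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
```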

Hope this helps in solving the issue.

uakarsh avatar Oct 07 '22 08:10 uakarsh

@spranjal25 are you fine with @uakarsh's suggestion?

Borda avatar Nov 07 '22 09:11 Borda

> @spranjal25 are you fine with @uakarsh's suggestion?

Yes, that's very helpful. I think I can get started on it. I'll pick this up as soon as I get some free time. Are we looking at a timeline here, though, @Borda?

spranjal25 avatar Nov 07 '22 10:11 spranjal25

no special rush :)

Borda avatar Nov 07 '22 19:11 Borda