
Finetune Transformers Models with PyTorch Lightning: documentation error?

Status: Open · yfeng24816 opened this issue 2 years ago · 4 comments

When calculating the total steps, shouldn't we use the number of batches multiplied by the number of epochs? In that case it would be `self.total_steps = (len(train_loader.dataset) // tb_size) * ab_size` instead of `self.total_steps = (len(train_loader.dataset) // tb_size) // ab_size`.

Please correct me if I'm wrong anywhere.

[screenshot: the total_steps calculation in the tutorial's setup() method]

https://pytorchlightning.github.io/lightning-tutorials/notebooks/lightning_examples/text-transformers.html
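For reference, the relevant part of the notebook's `setup()` looks roughly like this (reconstructed from the screenshot above, so details may differ slightly; `tb_size` and `ab_size` are the notebook's own names):

```python
def setup(self, stage=None) -> None:
    if stage != "fit":
        return
    # Grab the train dataloader to get the dataset size
    train_loader = self.trainer.datamodule.train_dataloader()

    # Effective batch size across all devices
    tb_size = self.hparams.train_batch_size * max(1, self.trainer.num_devices)
    # Accumulation factor times the number of epochs
    ab_size = self.trainer.accumulate_grad_batches * float(self.trainer.max_epochs)
    # The line in question: this divides by ab_size (and hence by max_epochs)
    self.total_steps = (len(train_loader.dataset) // tb_size) // ab_size
```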

cc @borda @rohitgr7

yfeng24816 · Feb 04 '22

I guess it should be `batches * num_epochs`, but why would it be `* ab_size`?

rohitgr7 · Feb 04 '22

Is `ab_size` something like `num_epochs`? It becomes `self.trainer.max_epochs` when `accumulate_grad_batches` is 1.
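A quick numeric check with made-up numbers: suppose there are 1000 batches per epoch, `accumulate_grad_batches = 1`, and `max_epochs = 5`:

```python
ab_size = 1 * 5.0              # = 5.0, i.e. just max_epochs
total_steps = 1000 // ab_size  # = 200.0, but the true total is 1000 * 5 = 5000
```

So the current code divides by the number of epochs instead of multiplying.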

yfeng24816 · Feb 04 '22

Okay, yes... I didn't see `max_epochs` there. It should be something like:

`total_steps = (total_batches / accumulation_factor) * max_epochs`
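In the notebook's variable names, a rough sketch of the fix (assuming `tb_size` as computed there) would be:

```python
# Optimizer steps per epoch: batches per epoch divided by the accumulation factor
steps_per_epoch = (len(train_loader.dataset) // tb_size) // self.trainer.accumulate_grad_batches
# Multiply (not divide) by the number of epochs to get the total
self.total_steps = steps_per_epoch * self.trainer.max_epochs
```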

rohitgr7 · Feb 04 '22

So do you also think there is an error in the documentation? I'm not sure on my own.

yfeng24816 · Feb 05 '22