academic-budget-bert
Repository containing code for the paper "How to Train BERT with an Academic Budget"
academic-budget-bert issues (11 results)
In the first epoch of pretraining, gradient overflow occurred in every iteration. In addition, the evaluation loss is null for some epochs after about the 17th epoch. It looks like the...