PL-BERT
What is the ideal range of loss when training PL-BERT (on any new language)?
I am training PL-BERT for 5,000,000 steps, but my loss is not decreasing and stays around 8. Could you let me know the ideal range of loss for PL-BERT training?
Step [2233480/5000000], Loss: 8.51781, Vocab Loss: 6.68871, Token Loss: 1.66403
Step [2233490/5000000], Loss: 8.26963, Vocab Loss: 6.07282, Token Loss: 1.52627
Step [2233500/5000000], Loss: 8.77817, Vocab Loss: 5.29348, Token Loss: 2.13625
Step [2233510/5000000], Loss: 8.51580, Vocab Loss: 6.69523, Token Loss: 2.22209
Step [2233520/5000000], Loss: 8.72296, Vocab Loss: 6.78622, Token Loss: 2.18462
Step [2233530/5000000], Loss: 8.26051, Vocab Loss: 6.93161, Token Loss: 1.61516
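For context, here is a minimal sketch of how the three logged numbers relate, assuming the usual PL-BERT setup where the total loss is the sum of a grapheme-prediction (vocab) cross-entropy and a masked-phoneme (token) cross-entropy. All shapes, `vocab_size`, and `n_phonemes` below are hypothetical placeholders, not the actual PL-BERT config:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes only -- not the actual PL-BERT configuration.
batch, seq_len = 8, 128
vocab_size = 30000   # grapheme/word-token vocabulary (assumed)
n_phonemes = 178     # phoneme symbol set (assumed)

# Model outputs for one batch: a grapheme prediction and a masked-phoneme
# prediction at every position of the phoneme sequence.
vocab_logits = torch.randn(batch, seq_len, vocab_size)
token_logits = torch.randn(batch, seq_len, n_phonemes)
vocab_targets = torch.randint(0, vocab_size, (batch, seq_len))
token_targets = torch.randint(0, n_phonemes, (batch, seq_len))

# Each logged quantity is a cross-entropy over the flattened sequence.
loss_vocab = F.cross_entropy(vocab_logits.reshape(-1, vocab_size),
                             vocab_targets.reshape(-1))
loss_token = F.cross_entropy(token_logits.reshape(-1, n_phonemes),
                             token_targets.reshape(-1))

# Total loss is the sum of the two objectives.
loss = loss_vocab + loss_token
print(f"Loss: {loss.item():.5f}, Vocab Loss: {loss_vocab.item():.5f}, "
      f"Token Loss: {loss_token.item():.5f}")
```

One consequence of this structure: with untrained (random) predictions, each cross-entropy sits near the natural log of its class count, so what counts as a "good" loss value depends on the vocabulary and phoneme set sizes of the language being trained.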