BERT4doc-Classification
High perplexity when Further Pre-Training
When doing further pre-training on my own data, the perplexity (ppl) is very high, for example 709. I have 3,582,619 examples and use batch size = 8, epochs = 3, learning rate = 5e-5. Is there any advice? Thanks a lot!
The further pre-training task is masked language modeling, not (causal) language modeling, so I think perplexity may not be a good metric here. Can you set your batch size larger or use gradient accumulation? You can also check the accuracy of the masked language model, as well as the loss curve, to monitor the further pre-training. A sketch of both ideas is given below.
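Below is a minimal sketch (not the repo's own further pre-training script) of what gradient accumulation plus masked-LM accuracy tracking could look like with the Hugging Face `transformers` library; the model name, toy corpus, batch size, and accumulation steps are illustrative assumptions, not settings from this issue.

```python
import torch
from transformers import BertTokenizerFast, BertForMaskedLM, DataCollatorForLanguageModeling

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Toy corpus standing in for your own in-domain data.
texts = ["further pre-training adapts BERT to the target domain."] * 16
encodings = [tokenizer(t, truncation=True, max_length=128) for t in texts]

# The collator pads each batch and applies random masking; labels are -100
# on non-masked positions so the loss only covers the masked tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
loader = torch.utils.data.DataLoader(encodings, batch_size=2, collate_fn=collator)

accumulation_steps = 4  # effective batch size = 2 * 4 = 8
optimizer.zero_grad()
for step, batch in enumerate(loader):
    outputs = model(**batch)
    # Scale the loss so the accumulated gradients average over the steps.
    (outputs.loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()

    # Masked-LM accuracy, computed only over the masked positions.
    with torch.no_grad():
        mask = batch["labels"] != -100
        preds = outputs.logits.argmax(dim=-1)
        acc = (preds[mask] == batch["labels"][mask]).float().mean().item()
    print(f"step {step}: loss {outputs.loss.item():.3f}, masked-LM acc {acc:.3f}")
```

Logging the masked-LM loss and accuracy over steps gives the curve mentioned above; if the loss keeps decreasing and masked-token accuracy keeps rising on your domain data, the further pre-training is behaving as expected even if the number reported as "perplexity" looks large.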