dont-stop-pretraining
When doing domain-adaptive pretraining, it seems the vocabulary cannot be extended?
After using my own corpus for domain-adaptive pretraining, the resulting vocab.txt is the same size as that of the initialized model (BERT-base). In short, domain-adaptive pretraining does not extend the vocabulary to the new domain? As a result, domain-specific vocabulary from the new domain still does not appear in the resulting vocab.txt. Is that correct?
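For reference, here is a minimal sketch (not from this repo) of how one could extend the vocabulary manually before continuing pretraining, assuming the Hugging Face transformers library; the checkpoint name and the `new_tokens` list are placeholders, not actual values from this project:

```python
from transformers import BertTokenizer, BertForMaskedLM

# Load the tokenizer and model you plan to continue pretraining from.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical domain-specific terms; replace with tokens mined from your corpus.
new_tokens = ["immunohistochemistry", "myocarditis"]
num_added = tokenizer.add_tokens(new_tokens)

# Resize the embedding matrix to fit the enlarged vocabulary. The new rows
# are randomly initialized, so they must be learned during the subsequent
# domain-adaptive pretraining run.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocab size is now {len(tokenizer)}")
```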