dont-stop-pretraining
When doing domain-adaptive pretraining, it seems the vocabulary cannot be extended?
After using my own corpus for domain-adaptive pretraining, the resulting vocab.txt is the same size as that of the initialized model (BERT-base). In short, domain-adaptive pretraining does not extend the vocabulary to the new domain? As a result, domain-specific vocabulary from the new domain still does not appear in the resulting vocab.txt. Is that correct?
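For reference, here is a minimal sketch (not from this repo) of how one could extend the vocabulary manually before continuing pretraining, assuming the Hugging Face transformers library; the checkpoint name and the `new_tokens` list are placeholders, not actual values from this project:

```python
from transformers import BertTokenizer, BertForMaskedLM

# Load the tokenizer and model you plan to continue pretraining from.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical domain-specific terms; replace with tokens mined from your corpus.
new_tokens = ["immunohistochemistry", "myocarditis"]
num_added = tokenizer.add_tokens(new_tokens)

# Resize the embedding matrix to fit the enlarged vocabulary. The new rows
# are randomly initialized, so they must be learned during the subsequent
# domain-adaptive pretraining run.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocab size is now {len(tokenizer)}")
```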