irhallac

3 comments by irhallac

> Pretrain from scratch or modify the first ~1000 lines of the vocab.txt file with the vocab you'd like to add. I also need to add a few thousand...

@peregilk thank you. In the model I downloaded, vocab.txt contains only 100 [unusedXXX] tokens, not 1000. Are you saying that 1000 entries can be changed?
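For anyone trying the vocab.txt approach from the quoted reply, here is a minimal sketch of how the placeholder swap might look. It assumes vocab.txt holds one token per line and that the spare entries match the pattern `[unusedN]`; the helper names `count_unused` and `fill_unused_slots` are made up for illustration and are not part of BERT or any library. It only overwrites placeholder lines in place, so the line count (and therefore every other token id) stays the same, and it refuses to proceed if there are more new tokens than free slots.

```python
import re

# Assumption: placeholder entries look like "[unused0]", "[unused1]", ...
UNUSED = re.compile(r"^\[unused\d+\]$")


def count_unused(vocab):
    """Count how many placeholder slots a vocab list contains."""
    return sum(1 for tok in vocab if UNUSED.match(tok))


def fill_unused_slots(vocab, new_tokens):
    """Return a copy of vocab with placeholders replaced, in order, by new_tokens.

    Tokens already present in vocab (and duplicates within new_tokens) are skipped.
    The length of the list never changes, so existing token ids are preserved.
    """
    existing = set(vocab)
    pending = [t for t in dict.fromkeys(new_tokens) if t not in existing]
    slots = [i for i, tok in enumerate(vocab) if UNUSED.match(tok)]
    if len(pending) > len(slots):
        raise ValueError(
            f"{len(pending)} new tokens but only {len(slots)} placeholder slots"
        )
    out = list(vocab)
    for i, tok in zip(slots, pending):
        out[i] = tok
    return out


# Tiny in-memory stand-in for a real file; with a real vocab you would use
# something like: vocab = open("vocab.txt", encoding="utf-8").read().splitlines()
vocab = ["[PAD]", "[unused1]", "[unused2]", "[unused3]", "[UNK]", "[CLS]", "hello"]
updated = fill_unused_slots(vocab, ["merhaba", "hello", "dünya"])
print(count_unused(vocab), "slots before,", count_unused(updated), "after")
print(updated)
```

Running `count_unused` on your own vocab.txt first tells you how many tokens you can add this way. If you need more than that, this sketch does not help, and the remaining option from the quoted reply is pretraining from scratch.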

@peregilk btw, I want to use the BERT model for Turkish. I downloaded it from download_url = 'https://storage.googleapis.com/bert_models/2018_11_23/multi_cased_L-12_H-768_A-12.zip' and it looks like this: ``` . [unused97] [unused98] [unused99] [UNK]...