UER-py

What tokenizer do the LSTM pre-trained models use?

Open wanyuks opened this issue 3 years ago • 1 comment

wanyuks, Nov 17 '22 08:11

Unless otherwise noted, the Chinese pre-trained models use the BERT tokenizer with models/google_zh_vocab.txt as the vocabulary (the same vocabulary used in the original BERT project).

zhezhaoa, Nov 18 '22 04:11
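
For readers unfamiliar with what "BERT tokenizer" means for Chinese text, here is a minimal sketch of the general idea: each CJK character becomes its own token, and any remaining words are split by greedy longest-match WordPiece against the vocabulary. This is not UER-py's code. The function names, the toy vocabulary, the lowercasing step, and the restriction to the main CJK Unified Ideographs block are all assumptions made for illustration, and punctuation splitting and accent stripping are left out.

```python
def load_vocab(path):
    # Hypothetical helper: a BERT-style vocab file lists one token per line.
    with open(path, encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f if line.strip()}


def is_cjk(ch):
    # Simplification: only the main CJK Unified Ideographs block is checked.
    return 0x4E00 <= ord(ch) <= 0x9FFF


def basic_tokenize(text):
    # Lowercase (an assumption), isolate each CJK character, split on whitespace.
    spaced = "".join(f" {ch} " if is_cjk(ch) else ch for ch in text.lower())
    return spaced.split()


def wordpiece(token, vocab, unk="[UNK]"):
    # Greedy longest-match-first split; non-initial pieces carry a "##" prefix.
    pieces, start = [], 0
    while start < len(token):
        end, match = len(token), None
        while start < end:
            sub = token[start:end]
            if start > 0:
                sub = "##" + sub
            if sub in vocab:
                match = sub
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole token is unknown
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab):
    return [p for tok in basic_tokenize(text) for p in wordpiece(tok, vocab)]


# Toy vocabulary for illustration only; it is not google_zh_vocab.txt.
TOY_VOCAB = {"[UNK]", "预", "训", "练", "模", "型", "lstm", "token", "##izer"}

tokens = tokenize("LSTM预训练模型 tokenizer", TOY_VOCAB)
print(tokens)
# ['lstm', '预', '训', '练', '模', '型', 'token', '##izer']
```

To try the sketch with the real vocabulary, `load_vocab("models/google_zh_vocab.txt")` could be passed in place of `TOY_VOCAB`, assuming the file is in the one-token-per-line format of the original BERT project.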