UER-py
What tokenizer do the pre-trained LSTM models use?
Unless otherwise noted, the Chinese pre-trained models use the BERT tokenizer with models/google_zh_vocab.txt as the vocabulary (the same vocabulary used in the original BERT project).
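To illustrate what "BERT tokenizer" means for Chinese text, here is a minimal sketch of the two stages involved: isolating each CJK character, then greedy longest-match WordPiece lookup. It is not UER-py's own code. The function names and `TOY_VOCAB` are hypothetical stand-ins for the real tokenizer and for models/google_zh_vocab.txt, the lowercasing step is an assumption, and the sketch omits the punctuation splitting and extra Unicode ranges that a full BERT tokenizer handles.

```python
def is_cjk(ch):
    # Simplification: only the main CJK Unified Ideographs block is checked.
    return 0x4E00 <= ord(ch) <= 0x9FFF


def basic_tokenize(text):
    # Pad every CJK character with spaces so each becomes its own token,
    # then split on whitespace.
    padded = "".join(f" {ch} " if is_cjk(ch) else ch for ch in text)
    return padded.split()


def wordpiece(token, vocab, unk="[UNK]"):
    # Greedy longest-match-first: repeatedly take the longest prefix found in
    # the vocabulary; non-initial pieces carry the "##" continuation prefix.
    pieces, start = [], 0
    while start < len(token):
        end, match = len(token), None
        while start < end:
            sub = token[start:end]
            if start > 0:
                sub = "##" + sub
            if sub in vocab:
                match = sub
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole token is unknown
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab):
    tokens = []
    for tok in basic_tokenize(text.lower()):  # lowercasing is an assumption
        tokens.extend(wordpiece(tok, vocab))
    return tokens


# Hypothetical toy vocabulary, standing in for the real vocabulary file.
TOY_VOCAB = {"[UNK]", "预", "训", "练", "模", "型", "lstm", "bert", "##s"}

print(tokenize("LSTM预训练模型", TOY_VOCAB))
```

The practical consequence is that Chinese input is split character by character rather than into words, so no separate word segmenter is needed before feeding text to these models.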