Hang Zhang
The teacher model for general distillation is BERT-base-uncased, and the corpus is the original one, the Toronto Book Corpus. You can search for the pretrained BERT model on Hugging Face as a reference.
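As a minimal sketch of what the comment above suggests, the `bert-base-uncased` teacher checkpoint can be pulled from the Hugging Face Hub with the `transformers` library. The helper name and default argument here are illustrative, not part of the original thread; downloading requires network access.

```python
from transformers import AutoModel, AutoTokenizer

# The teacher checkpoint named in the comment above.
TEACHER_CHECKPOINT = "bert-base-uncased"

def load_teacher(checkpoint: str = TEACHER_CHECKPOINT):
    """Fetch the pretrained teacher model and its tokenizer
    from the Hugging Face Hub (downloads on first call)."""
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    return tokenizer, model
```

Once loaded, the teacher's hidden states and attention maps can be used as distillation targets for the student.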
Hi, did anyone find the open-source pretrained models? Thank you!