Hang Zhang


The teacher model for general distillation is BERT-base-uncased, and the corpus is the original one, the Toronto Book Corpus. You can search for the pretrained BERT model on Hugging Face as a reference.

Hi, did anyone find the open-source pre-trained models? Thank you!