PKD-for-BERT-Model-Compression

PyTorch implementation of Patient Knowledge Distillation for BERT Model Compression
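For readers new to the method, the PKD objective pairs a standard soft-label distillation term with a "patient" term that matches the student's intermediate [CLS] hidden states to those of selected teacher layers. Below is a minimal, framework-free NumPy sketch of those two terms; the function names, the temperature value, and the layer-matching convention are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pkd_patient_loss(student_hidden, teacher_hidden):
    """Patient-teacher (PT) term: MSE between L2-normalized [CLS]
    hidden states of matched student/teacher layers.
    Each element is an array of shape (batch, hidden_dim)."""
    loss = 0.0
    for s, t in zip(student_hidden, teacher_hidden):
        s_n = s / np.linalg.norm(s, axis=-1, keepdims=True)
        t_n = t / np.linalg.norm(t, axis=-1, keepdims=True)
        loss += np.mean(np.sum((s_n - t_n) ** 2, axis=-1))
    return loss / len(student_hidden)

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Distillation (DS) term: cross-entropy between
    temperature-softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits / T)
    p_student = softmax(student_logits / T)
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1))
```

Because the hidden states are L2-normalized before the MSE, the patient term is invariant to the overall scale of a layer's activations; only the direction of the [CLS] representation is matched.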

Issues (4)

The code has a `--teacher_prediction` argument. Where does this come from? Is it saved during teacher-model training? I don't see where that happens.

Hi, thank you for your interesting work! I was just wondering why you didn't use the pooler for KD.Full only, and if you do use the pooler, did you initialize the pooler...

Hi, thank you for your interesting work! I have just started learning about BERT and distillation, and I have some general questions on this topic. 1. I want to compare...

First, thank you for releasing your code. I am trying to reproduce the results of your paper by running `NLI_KD_training.py` on MRPC with DEBUG=True. The setting I am running is...