CoFiPruning

Why prepruning distillation?

Open • mpiorczynski opened this issue 1 year ago • 1 comment

Hi, I have a question about the intuition behind the prepruning distillation step. Why do you not initialize the student model from the teacher's weights, instead of initializing it from scratch (i.e., from the pretrained MLM BERT checkpoint)?

mpiorczynski, Nov 29 '23 23:11