Hou Lu
The released code uses the width-adaptive teacher assistant at its largest width and depth (DynaBERTw, width_mult=1, depth_mult=1) as the teacher model. You can also use (DynaBERTw, width_mult, depth_mult=1) by inserting...