Hou Lu

Results: 1 comment by Hou Lu

The released code uses the width-adaptive teacher assistant at its largest width and depth (DynaBERTw, width_mult=1, depth_mult=1) as the teacher model. You can also use (DynaBERTw, width_mult, depth_mult=1) by inserting...