BiBERT
Was two-stage knowledge distillation used, as in BinaryBERT?
Was two-stage knowledge distillation, as used in BinaryBERT (see Table 7 of https://arxiv.org/pdf/2012.15701.pdf), applied to get these results?