DeeBERT
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
When running inference with highway early exit on a batch B: when |B| = 1, the code runs fine; when |B| > 1, the code can corrupt in the...
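To illustrate why batch size can matter for early exit, here is a minimal sketch, not the repository's code. It assumes the entropy-threshold exit criterion described in the DeeBERT paper: a sample exits at the first off-ramp whose output entropy falls below a threshold. The names `entropy`, `softmax`, and `exit_layers`, and the toy logits, are hypothetical and chosen only for this example. It shows that samples in one batch may want to exit at different layers, which is one plausible reason batched early exit needs per-sample bookkeeping.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def exit_layers(logits_per_layer, threshold):
    """Return, for each sample, the first layer whose off-ramp is confident.

    logits_per_layer[layer][sample] is that sample's off-ramp logits.
    A sample that is never confident exits at the last layer.
    (Hypothetical helper for illustration only.)
    """
    num_layers = len(logits_per_layer)
    batch_size = len(logits_per_layer[0])
    exits = [num_layers - 1] * batch_size
    done = [False] * batch_size
    for layer, batch_logits in enumerate(logits_per_layer):
        for i, logits in enumerate(batch_logits):
            if not done[i] and entropy(softmax(logits)) < threshold:
                exits[i] = layer
                done[i] = True
    return exits


# Toy data: 3 layers, batch of 2 samples, 2 classes.
toy_logits = [
    [[5.0, 0.0], [0.1, 0.0]],  # layer 0: sample 0 confident, sample 1 not
    [[5.0, 0.0], [0.5, 0.0]],  # layer 1: sample 1 still uncertain
    [[5.0, 0.0], [4.0, 0.0]],  # layer 2: sample 1 finally confident
]
per_sample_exits = exit_layers(toy_logits, threshold=0.3)
print(per_sample_exits)
```

With |B| = 1 there is a single exit decision per forward pass. With |B| > 1 the decisions can diverge, so an implementation that exits the whole batch at once would have to handle the samples that are not yet confident.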
Hello, I was running some experiments with DeeBERT and was wondering if there is any set of steps for adding new models beyond the ones currently included in the repository....
init_highway_pooler should be called only before training, not before evaluation.