projected-attention-layers topic
projected-attention-layers repositories
BERT-Multitask-learning
15 stars · 3 forks
Multitask learning with a BERT backbone. Makes it easy to train a BERT model with state-of-the-art methods such as PCGrad, Gradient Vaccine, PALs, scheduling, class imbalance handling and many optimizati...