bert-squeeze
Use Lightning dataloader hooks in soft and hard distillation
When performing distillation in soft or hard mode, the way the datasets are concatenated is dubious.
Lightning offers a handy way to use multiple datasets through its dataloader hooks (see documentation), which would make the code much cleaner and easier to understand.