sparseml
Created base class for knowledge distillation
This PR creates a base class for knowledge distillation modifiers and changes the existing knowledge distillation modifier to inherit from it. The goal is to make other forms of knowledge distillation possible by implementing only a specialized distillation loss in a subclass.
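The pattern described above can be illustrated with a minimal, dependency-free sketch. It is not the code in this PR: the class names (`BaseDistillationSketch`, `KLDistillationSketch`, `MSEDistillationSketch`), the method names, and the `hardness` and `temperature` parameters are all assumptions chosen for illustration, and plain Python lists stand in for tensors. The base class owns the blending of the task loss with the distillation term, and each subclass supplies only its own loss.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


class BaseDistillationSketch(ABC):
    """Hypothetical base class: blends losses, leaves the distillation term abstract."""

    def __init__(self, hardness: float = 0.5):
        if not 0.0 <= hardness <= 1.0:
            raise ValueError("hardness must be in [0, 1]")
        self.hardness = hardness

    @abstractmethod
    def distillation_loss(
        self, student_logits: Sequence[float], teacher_logits: Sequence[float]
    ) -> float:
        """Return the distillation term; this is the only method subclasses implement."""

    def combined_loss(
        self,
        task_loss: float,
        student_logits: Sequence[float],
        teacher_logits: Sequence[float],
    ) -> float:
        # Assumed blending rule: linear interpolation controlled by `hardness`.
        distill = self.distillation_loss(student_logits, teacher_logits)
        return (1.0 - self.hardness) * task_loss + self.hardness * distill


class KLDistillationSketch(BaseDistillationSketch):
    """KL divergence between temperature-softened teacher and student outputs."""

    def __init__(self, hardness: float = 0.5, temperature: float = 2.0):
        super().__init__(hardness)
        self.temperature = temperature

    def distillation_loss(self, student_logits, teacher_logits) -> float:
        teacher = softmax(teacher_logits, self.temperature)
        student = softmax(student_logits, self.temperature)
        kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0)
        # Scale by T^2 so the term's magnitude is comparable across temperatures.
        return kl * self.temperature ** 2


class MSEDistillationSketch(BaseDistillationSketch):
    """A second variant: mean squared error between raw logits."""

    def distillation_loss(self, student_logits, teacher_logits) -> float:
        diffs = [(s - t) ** 2 for s, t in zip(student_logits, teacher_logits)]
        return sum(diffs) / len(diffs)
```

In this sketch, adding a new form of distillation means writing one more subclass with a `distillation_loss` method; the blending logic in `combined_loss` is shared and untouched.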