RepDistiller
Hyperparameter Settings for KD on ImageNet
To reproduce the baseline result on my machine (KD from ResNet-34 to ResNet-18), I would like to know the hyperparameter settings used for knowledge distillation on ImageNet — in particular the weights on the cross-entropy loss and the KL-divergence loss, the temperature, and the batch size. Thanks.
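For context, below is a minimal sketch of the standard Hinton-style KD objective showing where each of the hyperparameters I am asking about enters. The `alpha`, `beta`, and `T` values here are placeholders for illustration only, not the settings actually used in this repo.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, alpha=0.9, beta=0.1, T=4.0):
    """Weighted sum of KL divergence on temperature-softened logits and
    cross entropy on the hard labels. alpha, beta, T are placeholders."""
    # KL divergence between softened student and teacher distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Cross entropy against the ground-truth labels
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kl + beta * ce
```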