Teacher-free-Knowledge-Distillation

Knowledge Distillation: CVPR2020 Oral, Revisiting Knowledge Distillation via Label Smoothing Regularization

Results 21 Teacher-free-Knowledge-Distillation issues

Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...

dependencies

The datasets used in the experiment section have many classes. Does this method (teacher-free distillation) work for a dataset with only two classes?

Hi, thank you for sharing such an awesome project. For the TF-reg KD, in [line 47 of my_loss_function.py](https://github.com/yuanli2333/Teacher-free-Knowledge-Distillation/blob/ecaa18475ebf657297fd340c5bdb312136b28313/my_loss_function.py#L47), should we also divide the output variable by the temperature T, like:...
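To make the temperature question above concrete, here is a minimal sketch in plain Python. It is not the contents of `my_loss_function.py`: the helper names (`softmax_t`, `manual_teacher`, `tf_reg_kl`) and the values chosen for `correct_prob` and `temperature` are illustrative assumptions. It builds a hand-designed teacher distribution, softens it with a temperature, and computes the KL term against the student output both with and without dividing the student logits by T.

```python
import math

def softmax_t(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def manual_teacher(num_classes, label, correct_prob):
    """Hypothetical hand-designed teacher: correct_prob on the true class,
    the remaining mass spread uniformly over the other classes."""
    other = (1.0 - correct_prob) / (num_classes - 1)
    return [correct_prob if i == label else other for i in range(num_classes)]

def tf_reg_kl(logits, label, temperature, correct_prob=0.9, scale_student=True):
    """KL between the softened teacher and the student output.

    scale_student=True divides the student logits by the temperature too;
    scale_student=False leaves the student at temperature 1.
    """
    teacher = manual_teacher(len(logits), label, correct_prob)
    # Soften the teacher by applying a temperature softmax to its log-probabilities.
    soft_teacher = softmax_t([math.log(p) for p in teacher], temperature)
    student_t = temperature if scale_student else 1.0
    student = softmax_t(logits, student_t)
    return kl_divergence(soft_teacher, student)

logits = [3.0, 1.0, 0.2]  # illustrative student logits
kl_scaled = tf_reg_kl(logits, label=0, temperature=20.0, scale_student=True)
kl_unscaled = tf_reg_kl(logits, label=0, temperature=20.0, scale_student=False)
print(kl_scaled, kl_unscaled)
```

With T different from 1 the two variants give different values, so whether the student output is scaled changes the regularization term and is worth checking against the paper's definition.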

Is the torchvision version in requirements.txt correct? The version is written as torchvision==0.4.0a0+9232c4a, so it gives an error. And one more important thing is to run this project...

I see that the default repo and its settings suit 32x32 input images. How can I make this work for larger images (e.g. 512x512)?

Hello, how did you decide on the data augmentation transformations that you applied to Tiny-ImageNet? Did you use the settings from some other paper? Thank you in advance.

Sorry to disturb you. I wonder what the difference is between L_REG and LSR. In my opinion, both LSR and L_REG are combinations of H(p,q) and H(u,q) with certain weights...
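The LSR half of that statement can be checked numerically. The sketch below assumes the issue's notation, with p the one-hot label, u the uniform distribution, and q the model's softmax output; the function names and the smoothing weight `alpha` are illustrative and not taken from the repository. Because cross-entropy is linear in the target distribution, the loss against the smoothed label equals (1 - alpha) * H(p,q) + alpha * H(u,q).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, pred):
    """H(target, pred) = -sum_k target_k * log(pred_k)."""
    return -sum(t * math.log(p) for t, p in zip(target, pred) if t > 0)

def label_smoothing_loss(logits, label, alpha):
    """Return the LSR loss computed two ways: directly against the smoothed
    label, and as a weighted sum of H(p, q) and H(u, q)."""
    k = len(logits)
    q = softmax(logits)
    p = [1.0 if i == label else 0.0 for i in range(k)]  # one-hot label
    u = [1.0 / k] * k                                   # uniform distribution
    smoothed = [(1 - alpha) * pi + alpha * ui for pi, ui in zip(p, u)]
    direct = cross_entropy(smoothed, q)
    decomposed = (1 - alpha) * cross_entropy(p, q) + alpha * cross_entropy(u, q)
    return direct, decomposed

direct, decomposed = label_smoothing_loss([2.0, 0.5, -1.0], label=0, alpha=0.1)
print(direct, decomposed)  # the two values agree up to floating-point error
```

This only illustrates the LSR side; how L_REG weights and softens its second term is defined in the paper and the repository, not here.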

Hi, could you kindly check the source of the pre-trained models? I can't download them from the website you provided. Many thanks in advance.

I'm training the ImageNet model. The `loss_function.py` file doesn't contain the above two functions. Where can I find these?

Hi, first I would like to say that I appreciate your work on interpreting the relationship between KD and LSR. However, the baseline of ResNet18 on CIFAR-100 is much lower than in the implementation [pytorch-cifar100](https://github.com/weiaicunzai/pytorch-cifar100),...