# Knowledge Diffusion for Distillation (DiffKD)

Official implementation of the paper ["Knowledge Diffusion for Distillation"](https://arxiv.org/abs/2305.15712) (DiffKD), NeurIPS 2023.
## Reproducing our results
```bash
git clone https://github.com/hunto/DiffKD.git --recurse-submodules
cd DiffKD
```
The implementation of DiffKD is in `classification/lib/models/losses/diffkd`.
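For orientation, here is a minimal, hypothetical sketch of the core idea, not the repository code: a lightweight denoiser is trained to remove noise from teacher features, and the student feature is treated as a noisy teacher feature that is denoised before distillation. Names such as `TinyDenoiser` and `diffkd_losses` are illustrative only; the actual implementation in the directory above includes additional components and training details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Illustrative stand-in for the lightweight diffusion model:
    predicts the noise contained in a feature map."""
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def diffkd_losses(student_feat, teacher_feat, denoiser, steps=4):
    """Sketch of the two objectives: (1) train the denoiser to remove
    Gaussian noise added to teacher features; (2) denoise the student
    feature and align it with the teacher feature."""
    # (1) Diffusion-style training signal on the (detached) teacher feature.
    noise = torch.randn_like(teacher_feat)
    noisy_teacher = teacher_feat.detach() + noise
    diffusion_loss = F.mse_loss(denoiser(noisy_teacher), noise)

    # (2) Treat the student feature as a noisy teacher feature and
    # iteratively strip the predicted noise before distillation.
    refined = student_feat
    for _ in range(steps):
        refined = refined - denoiser(refined)
    kd_loss = F.mse_loss(refined, teacher_feat.detach())
    return diffusion_loss, kd_loss

# Usage on dummy features (shapes are illustrative):
denoiser = TinyDenoiser(channels=512)
s = torch.randn(2, 512, 7, 7)   # student feature
t = torch.randn(2, 512, 7, 7)   # teacher feature
diff_l, kd_l = diffkd_losses(s, t, denoiser)
loss = diff_l + kd_l
```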
- `classification`: prepare your environment and datasets following the `README.md` in `classification`.
### ImageNet
```bash
cd classification
sh tools/dist_train.sh 8 ${CONFIG} ${MODEL} --teacher-model ${T_MODEL} --experiment ${EXP_NAME}
```
Example script for reproducing DiffKD with a ResNet-34 teacher and a ResNet-18 student under the B1 baseline setting:
```bash
sh tools/dist_train.sh 8 configs/strategies/distill/diffkd/diffkd_b1.yaml tv_resnet18 --teacher-model tv_resnet34 --experiment diffkd_res34_res18
```
- Baseline settings (`R34-R18` and `R50-MBV1`): `CONFIG=configs/strategies/distill/TODO`

Results are Top-1 accuracy (%) on ImageNet:

| Student | Teacher | DiffKD | MODEL | T_MODEL | Log | Ckpt |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 (69.76) | ResNet-34 (73.31) | 72.20 | `tv_resnet18` | `tv_resnet34` | log | ckpt |
| MobileNet V1 (70.13) | ResNet-50 (76.16) | 73.24 | `mobilenet_v1` | `tv_resnet50` | to be reproduced | |
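Once the baseline config is available (its path is still marked TODO above), the `R50-MBV1` row should follow the same command template; the experiment name below is illustrative:

```bash
sh tools/dist_train.sh 8 ${CONFIG} mobilenet_v1 --teacher-model tv_resnet50 --experiment diffkd_res50_mbv1
```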
## License
This project is released under the Apache 2.0 license.
## Citation
```bibtex
@article{huang2023knowledge,
  title={Knowledge Diffusion for Distillation},
  author={Huang, Tao and Zhang, Yuan and Zheng, Mingkai and You, Shan and Wang, Fei and Qian, Chen and Xu, Chang},
  journal={arXiv preprint arXiv:2305.15712},
  year={2023}
}
```