About the DeiT-Tiny baseline in your ViTKD experiments
Hello,
Thank you for your great work! I am very impressed by your research. In your ViTKD paper, the reported baseline accuracy of DeiT-Tiny and DeiT-Small appears noticeably higher than the numbers usually reported for these models.
As far as I know, the officially released DeiT-Tiny reaches around 72% and DeiT-Small around 80% with the official DeiT code: https://github.com/facebookresearch/deit/blob/main/README_deit.md
Could you please clarify which models you used and how they differ from the original ones? Your response would be extremely helpful.
Thank you.