Teacher-Assistant-Knowledge-Distillation

The performance of plain10 and plain2 on CIFAR-100

Open cotyyang opened this issue 2 years ago • 0 comments

I have been trying to reproduce the performance of plain10 and plain2 on CIFAR-100. I ran many experiments but could not reach the accuracy shown in Figure 4(b) of your paper. I then reread the experimental setup carefully and found the statement "We also used weight decay with the value of 0.0001 for training ResNets." Could you tell me the specific training settings you used, for example the weight decay, the learning rate, and whether crop was set to true or false?

cotyyang, Apr 29 '22 04:04