DyRep
Where can I download the pretrained model?
Sorry @TingquanGao ,
The trained checkpoints of our method were lost. Would it be feasible for you to train the models using the code?
Thanks for your reply. I want to reproduce this model, so it would be best if the pretrained model could be provided. Otherwise, I will retrain using the code.
Hi, because I want to reproduce this work, I tried training resnet18 using image_classification_sota, and the top-1 accuracy I got is 70.8, which is better than the value reported in the paper (69.54). I don't know if my settings are wrong. Waiting for your reply, thanks.
The script is:
python -m torch.distributed.launch --nproc_per_node=4 tools/train.py -c configs/strategies/resnet/resnet.yaml --model resnet18 --experiment imagenet_res18
Because I used 4 GPUs, I set the per-GPU batch size to 64. All other hyperparameters are left at their defaults.
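For reference, the effective batch size of this run can be worked out as below. This is a minimal sketch, assuming plain data-parallel training with one process per GPU and no gradient accumulation; the helper name `global_batch_size` is hypothetical and not part of the repository.

```python
def global_batch_size(num_gpus: int, per_gpu_batch: int) -> int:
    """Effective batch size per optimizer step under data-parallel training.

    Assumes one process per GPU and no gradient accumulation.
    """
    return num_gpus * per_gpu_batch


# Values from the run described above: --nproc_per_node=4, 64 images per GPU.
print(global_batch_size(num_gpus=4, per_gpu_batch=64))  # 256
```

If the default config was written for a different GPU count, the effective batch size (and hence possibly the appropriate learning rate) may differ from the one the defaults were tuned for.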
By the way, I found that the settings of DyRep-resnet18 and resnet18 are different, such as learning rate decay and color jitter. Is there any reason for this?
Hi @TingquanGao ,
In our paper, we directly report the baseline accuracy (69.54) as trained by DBB [1]. As for the training strategy, the baseline, DBB, and DyRep models are all trained using the same strategy (configs/strategies/DyRep/resnet.yaml), which differs from the official torchvision strategy (configs/strategies/resnet/resnet.yaml), e.g., in its use of cosine lr decay and color jitter. We simply follow DBB for a fair comparison.
As for why the DBB authors used a stronger strategy, the DBB paper does not explain it. Personally, I guess a stronger strategy better demonstrates the superiority of DBB in representation ability.
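To illustrate the learning rate decay difference mentioned above, here is a minimal sketch comparing a step schedule with a cosine schedule. The numbers (base lr 0.1, decay by 0.1 every 30 epochs, 120 total epochs) are illustrative assumptions and are not read from either YAML file; the function names are hypothetical.

```python
import math


def step_lr(epoch: int, base_lr: float = 0.1,
            step_size: int = 30, gamma: float = 0.1) -> float:
    """Step decay: multiply the lr by `gamma` every `step_size` epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_lr(epoch: int, total_epochs: int,
              base_lr: float = 0.1, min_lr: float = 0.0) -> float:
    """Cosine decay: anneal smoothly from `base_lr` to `min_lr`."""
    cos_factor = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (base_lr - min_lr) * cos_factor


# Print both schedules at a few epochs (assumed 120-epoch run).
for epoch in (0, 30, 60, 90, 119):
    print(epoch, step_lr(epoch), cosine_lr(epoch, total_epochs=120))
```

The two schedules start at the same value but follow different trajectories, so accuracy numbers obtained with one are not directly comparable to numbers obtained with the other.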
[1] Ding, X., Zhang, X., Han, J. and Ding, G., 2021. Diverse branch block: Building a convolution as an inception-like unit. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10886-10895).