CutMix-PyTorch
Scripts for mentioned experiments
Hi, thanks for this great repo!
I am new to this area and I have a few questions regarding the experiments:
- Do you still have the scripts (or hyperparameter lists) for the experiments in the paper? For example, ResNet-50 and ResNet-101 on ImageNet, and PyramidNet-200 / PyramidNet-110 on the CIFAR datasets.
- When I ran ResNet-50 on CIFAR-10, the model had about 0.5M parameters, while ResNet-50 on ImageNet has about 25M. (This is the command I used: python train.py --net_type resnet --dataset cifar10 --depth 50 --alpha 240 --batch_size 64 --lr 0.25 --expname ResNet50 --epochs 300 --beta -1.0 --cutmix_prob 0.5 --no-verbose)
- How should I conduct transfer learning experiments?
Hope you can help me with this.
Thanks a lot!
- The script for ResNet-50 on ImageNet is in the README; training ResNet-101 on ImageNet uses the same settings (just change --depth to 101). A short note on what --beta and --cutmix_prob control follows the command:
python train.py \
--net_type resnet \
--dataset imagenet \
--batch_size 256 \
--lr 0.1 \
--depth 50 \
--epochs 300 \
--expname ResNet50 \
-j 40 \
--beta 1.0 \
--cutmix_prob 1.0 \
--no-verbose
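For reference, --beta sets the Beta(β, β) distribution from which the combination ratio λ is sampled, and --cutmix_prob is the probability of applying CutMix to a given batch (the code only enters the CutMix branch when beta > 0). The helper below is a condensed paraphrase of that step in train.py; the function cutmix_batch is my own wrapper, not a name from the repo:

import numpy as np
import torch

def rand_bbox(size, lam):
    # Sample a box covering a (1 - lam) fraction of the image area.
    W, H = size[2], size[3]
    cut_rat = np.sqrt(1. - lam)
    cut_w, cut_h = int(W * cut_rat), int(H * cut_rat)
    cx, cy = np.random.randint(W), np.random.randint(H)
    return (np.clip(cx - cut_w // 2, 0, W), np.clip(cy - cut_h // 2, 0, H),
            np.clip(cx + cut_w // 2, 0, W), np.clip(cy + cut_h // 2, 0, H))

def cutmix_batch(input, target, beta=1.0, cutmix_prob=1.0):
    # Returns (mixed_input, target_a, target_b, lam); falls back to the unmodified batch.
    if beta <= 0 or np.random.rand() >= cutmix_prob:
        return input, target, target, 1.0
    lam = np.random.beta(beta, beta)              # combination ratio
    rand_index = torch.randperm(input.size(0))    # partner image for each sample
    bbx1, bby1, bbx2, bby2 = rand_bbox(input.size(), lam)
    input[:, :, bbx1:bbx2, bby1:bby2] = input[rand_index, :, bbx1:bbx2, bby1:bby2]
    # Recompute lam from the exact area that was pasted.
    lam = 1 - ((bbx2 - bbx1) * (bby2 - bby1) / (input.size(-1) * input.size(-2)))
    return input, target, target[rand_index], lam

In the training loop the mixed loss is then criterion(output, target_a) * lam + criterion(output, target_b) * (1 - lam).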
I'm not 100% sure, but the PyramidNet (110 or 200) training on CIFAR should also be the same as in the README:
python train.py \
--net_type pyramidnet \
--dataset cifar100 \
--depth 200 \
--alpha 240 \
--batch_size 64 \
--lr 0.25 \
--expname PyraNet200 \
--epochs 300 \
--beta 1.0 \
--cutmix_prob 0.5 \
--no-verbose
- This is because the ResNet architecture is different for the CIFAR and ImageNet datasets. See https://github.com/clovaai/CutMix-PyTorch/blob/master/resnet.py#L87-L121 (a quick way to verify the parameter counts is sketched after this list).
- First train your own pretrained models, or just download our pretrained models (https://github.com/clovaai/CutMix-PyTorch/blob/master/README.md#experiments), and fine-tune them on your own downstream datasets.
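To see that gap concretely, counting parameters is enough. The snippet below uses torchvision's ImageNet-style ResNet-50 for the ~25.6M side; count_params is just a local helper, not something from this repo, and the same helper applied to the CIFAR model that train.py builds from resnet.py reproduces the sub-1M figure you observed:

import torchvision.models as models

def count_params(model):
    # Total number of learnable parameters.
    return sum(p.numel() for p in model.parameters())

# ImageNet-style ResNet-50: 7x7 stem, four wide bottleneck stages, 1000-way classifier.
print(f'{count_params(models.resnet50()) / 1e6:.1f}M')  # prints roughly 25.6M

# The CIFAR branch of resnet.py builds a much smaller network (3x3 stem, far narrower
# stages), so running the same check on the model created with --dataset cifar10
# --depth 50 gives well under 1M parameters.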
Thanks a lot for the reply.
For the 3rd question, do you still have the code for fine-tuning the pre-trained models on the downstream datasets? I cannot find it in the current repo.
This repo does not have downstream training/testing code. The instructions to train/test on downstream tasks are described in our paper.
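For anyone arriving here later, this is not code from this repo or the paper's exact recipe; it is only a minimal sketch of the standard fine-tuning procedure, assuming an ImageFolder-style downstream dataset and a checkpoint saved through nn.DataParallel (hence the 'module.' prefix stripping). The checkpoint filename, class count, schedule, and hyperparameters are placeholders:

import torch
import torch.nn as nn
import torchvision.models as models
from torchvision import datasets, transforms

# Build an ImageNet-style ResNet-50 and load CutMix-pretrained weights.
model = models.resnet50()
ckpt = torch.load('resnet50_cutmix.pth', map_location='cpu')          # placeholder path
state_dict = ckpt.get('state_dict', ckpt)                             # key layout may differ
state_dict = {k.replace('module.', ''): v for k, v in state_dict.items()}
model.load_state_dict(state_dict)

# Replace the classifier head for the downstream task (e.g. 37 classes).
model.fc = nn.Linear(model.fc.in_features, 37)

# Fine-tune all layers with a small learning rate (a typical recipe, not the paper's).
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder('path/to/downstream/train', transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

model.train()
for epoch in range(30):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()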
Hi, have you experienced any performance gap between using one and two GPUs?