SimpleAICV_pytorch_training_examples

SimpleAICV: PyTorch training and testing examples.

  • My ZhiHu column
  • Introduction
  • Environments
  • Prepare datasets
  • Download my pretrained models
  • Train and test model
  • Classification training results
    • ILSVRC2012(ImageNet) training results
    • CIFAR100 training results
  • Detection training results
    • COCO2017 training results
    • VOC2007 and VOC2012 training results
  • Distillation training results
    • ImageNet training results
  • Citation

My ZhiHu column


Introduction

This repository provides training and testing examples for image classification, object detection, and knowledge distillation tasks.

image classification:

ResNet (including autoaugment/randaugment)

object detection:


knowledge distillation:

KD loss
DKD loss
DML loss


Environments

This repository only supports single-server training, with either one GPU card or multiple GPU cards.


Ubuntu 20.04.3 LTS, 30-core AMD EPYC 7543 32-Core Processor, 2 x RTX A5000, Python 3.8, CUDA 11.3

Please make sure your Python version is >= 3.7.

Use pip or conda to install these packages:


How to install apex?

apex needs to be installed separately. First, download apex:

git clone

For torch 1.10, modify apex/apex/amp/ by changing

if cached_x.grad_fn.next_functions[1][0].variable is not x:

to

if cached_x.grad_fn.next_functions[0][0].variable is not x:

Then use the following commands to install apex:

cd apex
pip install -v --no-cache-dir ./

Training with apex can reduce GPU memory usage by 25%-30%, but training is slower. The trained model performs the same as one trained without apex.

Prepare datasets

If you want to reproduce my ImageNet pretrained models, you need to download the ILSVRC2012 dataset and make sure the folder structure is as follows:

|-----train----1000 sub classes folders
|-----val------1000 sub classes folders
Please make sure each class uses the same class folder name in both the train and val folders.
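The folder-name requirement above can be checked before training. The following is a minimal sketch, not part of this repository: the function names `class_folders` and `check_imagenet_layout` are hypothetical, and it assumes the dataset root contains the `train` and `val` folders shown above.

```python
from pathlib import Path


def class_folders(split_dir):
    """Return the set of sub-folder names (one per class) inside split_dir."""
    return {p.name for p in Path(split_dir).iterdir() if p.is_dir()}


def check_imagenet_layout(root, expected_classes=1000):
    """Compare the class folder names under root/train and root/val.

    Returns (ok, only_in_train, only_in_val). `ok` is True when both splits
    contain exactly the same folder names and there are `expected_classes`
    of them.
    """
    train = class_folders(Path(root) / "train")
    val = class_folders(Path(root) / "val")
    only_in_train = sorted(train - val)
    only_in_val = sorted(val - train)
    ok = (not only_in_train and not only_in_val
          and len(train) == expected_classes)
    return ok, only_in_train, only_in_val
```

Calling `check_imagenet_layout("/path/to/ILSVRC2012")` (a placeholder path) returns the folder names that appear in only one split, which makes a mismatch easy to locate.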

If you want to reproduce my CIFAR100 pretrained models, you need to download the CIFAR100 dataset and make sure the folder structure is as follows:

|-----train unzip from cifar-100-python.tar.gz
|-----test  unzip from cifar-100-python.tar.gz
|-----meta  unzip from cifar-100-python.tar.gz
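The three files extracted from cifar-100-python.tar.gz are Python pickles. As a quick sanity check after unpacking, the sketch below loads them and counts images and classes. It is an illustration rather than this repository's loader: the function names are hypothetical, and it assumes the standard key names of the CIFAR-100 python version (`fine_labels`, `fine_label_names`).

```python
import pickle
from pathlib import Path


def load_cifar100_file(path):
    """Unpickle one CIFAR-100 python-version file (train, test or meta)."""
    with open(path, "rb") as f:
        # The files were pickled with Python 2, so with encoding="bytes"
        # the dictionary keys come back as bytes objects.
        return pickle.load(f, encoding="bytes")


def summarize_cifar100(root):
    """Return image and class counts for a folder holding train/test/meta."""
    root = Path(root)
    train = load_cifar100_file(root / "train")
    test = load_cifar100_file(root / "test")
    meta = load_cifar100_file(root / "meta")
    return {
        "train_images": len(train[b"fine_labels"]),
        "test_images": len(test[b"fine_labels"]),
        "classes": len(meta[b"fine_label_names"]),
    }
```

On an intact copy of the dataset the summary should report 50000 training images, 10000 test images, and 100 classes.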

If you want to reproduce my COCO2017 pretrained models, you need to download the COCO2017 dataset and make sure the folder structure is as follows:

|                |----captions_train2017.json
|                |----captions_val2017.json
|                |----instances_val2017.json
|                |----person_keypoints_train2017.json
|                |----person_keypoints_val2017.json
|                |----train2017

If you want to reproduce my VOC2007/VOC2012 pretrained models, you need to download the VOC2007+VOC2012 dataset and make sure the folder structure is as follows:

|                 |----Annotations
|                 |----ImageSets
|                 |----SegmentationClass
|                 |----SegmentationObject
|                 |----Annotations
|                 |----ImageSets
|                 |----SegmentationClass
|                 |----SegmentationObject

If you want to reproduce my ADE20K pretrained models, you need to download the ADE20K dataset and make sure the folder structure is as follows:

|                 |----training
|                 |----testing
|                 |----training

Download datasets and my pretrained models

You can download all my pretrained models from Baidu-Netdisk:

# datasets
# my pretrained models and records

Train and test model

If you want to train or test a model, you need to enter a training folder directory, then run and

For example, you can enter classification_training/imagenet/resnet50.

If you want to train this model from scratch, please delete the checkpoints and log folders first, then run

CUDA_VISIBLE_DEVICES=0,1 python -m --nproc_per_node=2 --master_addr --master_port 10000 ../../../tools/ --work-dir ./

CUDA_VISIBLE_DEVICES specifies the GPU ids used for this training. Please make sure nproc_per_node equals the number of GPU cards. Make sure master_addr/master_port are unique for each training run.
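The two launch constraints above (nproc_per_node must match the number of visible GPUs, and each run needs its own master_port) can be checked with a few lines of standard-library Python. This is a hedged sketch: `visible_gpu_count`, `check_launch`, and `find_free_port` are hypothetical helpers, not part of this repository or of PyTorch.

```python
import socket


def visible_gpu_count(cuda_visible_devices):
    """Count the GPU ids listed in a CUDA_VISIBLE_DEVICES string such as '0,1'."""
    return len([d for d in cuda_visible_devices.split(",") if d.strip()])


def check_launch(cuda_visible_devices, nproc_per_node):
    """Raise ValueError if nproc_per_node differs from the visible GPU count."""
    n = visible_gpu_count(cuda_visible_devices)
    if n != nproc_per_node:
        raise ValueError(
            f"{n} visible GPUs but nproc_per_node={nproc_per_node}")
    return n


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port to pass as master_port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS pick any free port
        return s.getsockname()[1]
```

For the example command, `check_launch("0,1", 2)` passes. The port returned by `find_free_port()` is only free at the moment of the call, so another process could still take it before training starts.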

If you want to test this model, you need a pretrained model first. Modify trained_model_path in, then run

CUDA_VISIBLE_DEVICES=0,1 python -m --nproc_per_node=2 --master_addr --master_port 10000 ../../../tools/ --work-dir ./

You can also modify hyperparameters in

Classification training results

ILSVRC2012(ImageNet) training results

| Network | macs | params | input size | gpu num | batch | warm up | lr decay | apex | syncbn | epochs | Top-1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet18 | 1.819G | 11.690M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 70.712 |
| ResNet34half | 949.323M | 5.585M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 67.752 |
| ResNet34 | 3.671G | 21.798M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 73.752 |
| ResNet50half | 1.063G | 6.918M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 72.902 |
| ResNet50 | 4.112G | 25.557M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 76.264 |
| ResNet101 | 7.834G | 44.549M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 77.322 |
| ResNet152 | 11.559G | 60.193M | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 78.006 |
| ResNet50-200epoch | 4.112G | 25.557M | 224x224 | 2 RTX A5000 | 256 | 5 | cosinelr | True | False | 200 | 76.986 |
| ResNet50-autoaugment | 4.112G | 25.557M | 224x224 | 2 RTX A5000 | 256 | 5 | cosinelr | True | False | 200 | 77.736 |
| ResNet50-randaugment | 4.112G | 25.557M | 224x224 | 2 RTX A5000 | 256 | 5 | cosinelr | True | False | 200 | 77.846 |
| DarkNetTiny | 412.537M | 2.087M | 256x256 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 57.602 |
| DarkNet19 | 3.663G | 20.842M | 256x256 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 74.028 |
| DarkNet53 | 9.322G | 41.610M | 256x256 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 76.602 |
| RepVGG_A0_deploy | 1.362G | 8.309M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 72.156 |
| RepVGG_A1_deploy | 2.364G | 12.790M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 74.056 |
| RepVGG_A2_deploy | 5.117G | 25.500M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 76.022 |
| RepVGG_B0_deploy | 3.058G | 14.339M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 74.750 |
| RepVGG_B1_deploy | 11.816G | 51.829M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 77.834 |
| RepVGG_B2_deploy | 18.377G | 80.315M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 78.226 |
| RegNetX_400MF | 410.266M | 5.158M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 72.364 |
| RegNetX_600MF | 616.813M | 6.196M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 73.598 |
| RegNetX_800MF | 820.324M | 7.260M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 74.444 |
| RegNetX_1_6GF | 1.635G | 9.190M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 76.580 |
| RegNetX_3_2GF | 3.222G | 15.297M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 77.512 |
| RegNetX_4_0GF | 4.013G | 22.118M | 224x224 | 2 RTX A5000 | 256 | 0 | cosinelr | True | False | 120 | 77.722 |

You can find more model training details in classification_training/imagenet/.

CIFAR100 training results

| Network | macs | params | input size | gpu num | batch | warm up | lr decay | apex | syncbn | epochs | Top-1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet18Cifar | 557.935M | 11.220M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 76.730 |
| ResNet34halfCifar | 292.370M | 5.350M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 75.730 |
| ResNet34Cifar | 1.164G | 21.328M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 77.850 |
| ResNet50halfCifar | 331.879M | 5.991M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 75.880 |
| ResNet50Cifar | 1.312G | 23.705M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 75.890 |
| ResNet101Cifar | 2.531G | 42.697M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 79.710 |
| ResNet152Cifar | 3.751G | 58.341M | 32x32 | 1 RTX A5000 | 128 | 0 | multistep | True | False | 200 | 77.150 |

You can find more model training details in classification_training/cifar100/.

Detection training results

COCO2017 training results

Trained on COCO2017_train dataset, tested on COCO2017_val dataset.

mAP is the COCOeval stats[0] value: IoU=0.5:0.95, area=all, maxDets=100.

How to use the YOLOv3 anchor clustering method to generate a set of custom anchors for your own dataset?

I provide a script in simpleAICV/detection/, and I give two examples of generating anchors on the COCO2017 and VOC2007+2012 datasets. If you want to generate anchors for your own dataset, modify the input part of the code to collect the width and height of all annotation boxes, then use the script to compute the anchors. The anchor sizes change with the dataset and with the input resize.
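To illustrate the idea behind anchor clustering, here is a minimal, dependency-free sketch of k-means over (width, height) pairs with the distance d = 1 - IoU, which is a common formulation of YOLO-style anchor clustering. It is not the script shipped in this repository: the function names, the mean-based centroid update, and the default `iters` and `seed` values are assumptions made for the example.

```python
import random


def wh_iou(box, anchor):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs using the distance d = 1 - IoU.

    `boxes` is a list of (width, height) tuples measured after resizing to
    the network input size. Returns k anchors sorted by area.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k randomly chosen boxes
    assignment = None
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most (smallest 1 - IoU).
        new_assignment = [
            max(range(k), key=lambda j: wh_iou(b, anchors[j])) for b in boxes
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing: converged
        assignment = new_assignment
        # Move each anchor to the mean width/height of its cluster.
        for j in range(k):
            members = [b for b, a in zip(boxes, assignment) if a == j]
            if members:  # keep the previous anchor if a cluster is empty
                anchors[j] = (
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                )
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

Because the box sizes are measured after resizing, running the same clustering with a different input size or dataset yields different anchors, which matches the note above.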

| Network | resize-style | input size | macs | params | gpu num | batch | warm up | lr decay | apex | syncbn | epochs | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50-RetinaNet | RetinaStyle-400 | 400x667 | 63.093G | 37.969M | 2 RTX A5000 | 32 | 0 | multistep | True | False | 13 | 31.939 |
| ResNet50-RetinaNet | RetinaStyle-800 | 800x1333 | 250.069G | 37.969M | 2 RTX A5000 | 8 | 0 | multistep | True | False | 13 | 35.082 |
| ResNet50-RetinaNet | YoloStyle-640 | 640x640 | 95.558G | 37.969M | 2 RTX A5000 | 32 | 0 | multistep | True | False | 13 | 33.475 |
| ResNet101-RetinaNet | RetinaStyle-800 | 800x1333 | 329.836G | 56.961M | 2 RTX A5000 | 8 | 0 | multistep | True | False | 13 | 36.406 |
| ResNet50-FCOS | RetinaStyle-400 | 400x667 | 54.066G | 32.291M | 2 RTX A5000 | 32 | 0 | multistep | True | False | 13 | 34.671 |
| ResNet50-FCOS | RetinaStyle-800 | 800x1333 | 214.406G | 32.291M | 2 RTX A5000 | 8 | 0 | multistep | True | False | 13 | 37.850 |
| ResNet50-FCOS | YoloStyle-640 | 640x640 | 81.943G | 32.291M | 2 RTX A5000 | 32 | 0 | multistep | True | False | 13 | 35.629 |
| ResNet101-FCOS | RetinaStyle-800 | 800x1333 | 294.173G | 51.283M | 2 RTX A5000 | 8 | 0 | multistep | True | False | 13 | 39.547 |
| ResNet18DCN-CenterNet | YoloStyle-512 | 512x512 | 14.854G | 12.889M | 2 RTX A5000 | 64 | 0 | multistep | True | False | 140 | 27.947 |
| ResNet18DCN-TTFNet-3x | YoloStyle-512 | 512x512 | 16.063G | 13.737M | 2 RTX A5000 | 64 | 0 | multistep | True | False | 39 | 27.847 |

You can find more model training details in detection_training/coco/.

VOC2007 and VOC2012 training results

Trained on the VOC2007 trainval + VOC2012 trainval datasets, tested on the VOC2007 test dataset.

mAP is computed at IoU=0.50, area=all, maxDets=100.

| Network | resize-style | input size | macs | params | gpu num | batch | warm up | lr decay | apex | syncbn | epochs | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50-RetinaNet | YoloStyle-640 | 640x640 | 84.947G | 36.724M | 2 RTX A5000 | 32 | 0 | multistep | True | False | 13 | 80.693 |
| ResNet50-FCOS | YoloStyle-640 | 640x640 | 80.764G | 32.153M | 2 RTX A5000 | 32 | 0 | multistep | True | False | 13 | 79.960 |

You can find more model training details in detection_training/voc/.
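Both detection metrics above rest on the IoU between a predicted box and a ground-truth box: the VOC mAP uses a single threshold of 0.50, while the COCO mAP averages over thresholds from 0.50 to 0.95. The sketch below shows that building block only, not a full mAP evaluation. The function names are hypothetical, boxes are assumed to be (x_min, y_min, x_max, y_max), and the `>=` comparison is an assumption, since evaluators differ on whether the threshold is strict.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches(pred_box, gt_box, iou_threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count."""
    return box_iou(pred_box, gt_box) >= iou_threshold


def coco_iou_thresholds():
    """The ten thresholds 0.50, 0.55, ..., 0.95 behind IoU=0.5:0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

A prediction that shifts a 10x10 box by 2 pixels along one axis has an IoU of 80/120 with the original, so it counts as a match at 0.50 but not at 0.70 or above.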

Distillation training results

ImageNet training results

KD loss


DKD loss


DML loss


| Teacher Network | Student Network | method | Freeze Teacher | input size | gpu num | batch | warm up | lr decay | apex | syncbn | epochs | Teacher Top-1 | Student Top-1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet34 | ResNet18 | CE+KD | True | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | / | 71.984 |
| ResNet34 | ResNet18 | CE+DKD | True | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | / | 72.110 |
| ResNet34 | ResNet18 | CE+DML | False | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 74.674 | 72.064 |
| ResNet152 | ResNet50 | CE+KD | True | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | / | 76.476 |
| ResNet152 | ResNet50 | CE+DKD | True | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | / | 77.616 |
| ResNet152 | ResNet50 | CE+DML | False | 224x224 | 2 RTX A5000 | 256 | 0 | multistep | True | False | 100 | 79.148 | 77.622 |

You can find more model training details in distillation_training/imagenet/.
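For readers unfamiliar with the CE+KD method in the table, the sketch below shows the widely used formulation of the KD loss for a single sample: the KL divergence between temperature-softened teacher and student distributions, scaled by T^2 and added to the student's cross-entropy. It is written in plain Python for clarity and is not this repository's implementation. The function names and the default `temperature=4.0` and `alpha=1.0` are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, times T^2."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(p_t, p_s) if t > 0)
    return kl * temperature ** 2


def ce_loss(logits, label):
    """Cross-entropy of one sample against its integer class label."""
    return -math.log(softmax(logits)[label])


def ce_plus_kd(student_logits, teacher_logits, label,
               alpha=1.0, temperature=4.0):
    """Student cross-entropy plus the weighted distillation term."""
    return (ce_loss(student_logits, label)
            + alpha * kd_loss(student_logits, teacher_logits, temperature))
```

When the student's logits equal the teacher's, `kd_loss` is zero and only the cross-entropy term remains. The rows with Freeze Teacher set to True correspond to the setting where only the student receives gradients.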


Citation

If you find my work useful in your research, please consider citing:
