LGD
Performance drops with some other detectors
Hi, thanks for the great work. Have you ever tried LGD with more advanced detectors, such as TOOD or DDOD? I reimplemented LGD in MMDetection and inserted it into TOOD and DDOD, but it yields lower performance than the baselines (DDOD mAP drops from 41.7 to 38.7, and TOOD mAP drops from 42.3 to 38.7) under the R50-FPN 1x single-scale (1xss) setting. BTW, the code in your repo also contains an ATSS detector, but I didn't see an ATSS experiment in your paper. Have you tried ATSS with LGD? It would be appreciated if you could provide more experiment info :)
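For reference, my MMDetection integration looks roughly like the sketch below. This is a minimal sketch only: `LGDDistillDetector`, `LabelAppearanceEncoder`, and the `distill` keys are placeholder names from my reimplementation, not identifiers from your repo or from MMDetection; only the `_base_` config is an official MMDetection file.

```python
# Rough sketch of my MMDetection integration (placeholder names; my actual
# reimplementation differs in details). Inherits MMDetection's official
# configs/tood/tood_r50_fpn_1x_coco.py.
_base_ = ['../tood/tood_r50_fpn_1x_coco.py']

model = dict(
    type='LGDDistillDetector',  # hypothetical wrapper around the TOOD student
    label_encoder=dict(
        type='LabelAppearanceEncoder',  # label-appearance encoder, per the paper
        embed_dim=256),
    distill=dict(
        pre_nondistill_iters=30000,               # matches PRE_NONDISTILL_ITERS below
        pre_freeze_student_backbone_iters=10000,  # matches PRE_FREEZE_STUDENT_BACKBONE_ITERS below
        loss_weight=1.0))
```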
I also tried the ATSSCT distillator in your repo with the config files below. The result shows that LGD improves mAP to 39.89, compared with the 39.42 baseline, in the 1x single-scale R50 setting. The baseline mAP looks normal, but the gain from LGD seems small. Is this result expected? I want to build some improvements on top of LGD, and this result is very important for deciding on follow-up work. It would be appreciated if you could provide more experiment info. Thanks! @zhangpzh

Here are the config files:

`atss_R_50_1xSS_prD30K_prS10K_bs16.yaml`
```yaml
_BASE_: "../../oss_baseline/Base-RetinaNet_1xss_bs16.yaml"
MODEL:
  META_ARCHITECTURE: 'DistillatorATSS'
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  MASK_ON: False
  RESNETS:
    DEPTH: 50
  DISTILLATOR:
    TEACHER:
      META_ARCH: 'DynamicTeacher'
      SOLVER:
        OPTIMIZER: 'SGD'
        BASE_LR: 0.01
        MOMENTUM: 0.9
        WEIGHT_DECAY: 1e-4
        LR_SCHEDULER_NAME: "WarmupMultiStepLR"
        STEPS: (60000, 80000)
        GAMMA: 0.1
        WARMUP_FACTOR: 1e-3
        WARMUP_ITERS: 1000
        WARMUP_METHOD: "linear"
      INTERACT_PATTERN: 'stuGuided'
      DETACH_APPEARANCE_EMBED: False
      ADD_CONTEXT_BOX: True
    STUDENT:
      META_ARCH: 'ATSSCT'
      SOLVER:
        OPTIMIZER: 'SGD'
        BASE_LR: 0.01
        MOMENTUM: 0.9
        WEIGHT_DECAY: 1e-4
        LR_SCHEDULER_NAME: "WarmupMultiStepLR"
        STEPS: (60000, 80000)
        GAMMA: 0.1
        WARMUP_FACTOR: 1e-3
        WARMUP_ITERS: 1000
        WARMUP_METHOD: "linear"
    ADAPTER:
      META_ARCH: 'SequentialConvs'
    PRE_NONDISTILL_ITERS: 30000
    POST_NONDISTILL_ITERS: 0
    PRE_FREEZE_STUDENT_BACKBONE_ITERS: 10000
    LAMBDA: 1.0
    EVAL_TEACHER: True
INPUT:
  MIN_SIZE_TRAIN: (800,)
SOLVER:
  STEPS: (60000, 80000)
  MAX_ITER: 90000
# OUTPUT_DIR: 'outputs/RetinaNet/retinanet_R_50_1xSS_stuGuided_addCtxBox=YES_detachAppearanceEmbed=NO_preNondistillIters=30k_preFreezeStudentBackboneIters=10k/'
```
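Both SOLVER blocks above request WarmupMultiStepLR. For clarity, here is a minimal sketch of the schedule those numbers produce, assuming detectron2's linear-warmup semantics (the `lr_at` helper is illustrative only, not from the repo):

```python
# Illustrative stand-in for WarmupMultiStepLR with the values from the
# config above: linear warmup over 1000 iters starting at BASE_LR * 1e-3,
# then decay by GAMMA = 0.1 at iterations 60000 and 80000.
def lr_at(it, base_lr=0.01, steps=(60000, 80000), gamma=0.1,
          warmup_iters=1000, warmup_factor=1e-3):
    if it < warmup_iters:
        alpha = it / warmup_iters
        return base_lr * (warmup_factor * (1 - alpha) + alpha)
    return base_lr * gamma ** sum(it >= s for s in steps)

print(lr_at(0), lr_at(1000), lr_at(60000), lr_at(80000))
# -> 1e-05 0.01 0.001 0.0001
```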
`Base-RetinaNet_1xss_bs16.yaml`
_BASE_: "./bs32_schedule1x.yaml"
MODEL:
META_ARCHITECTURE: "RetinaNet"
# TODO weight and deepth
WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
BACKBONE:
NAME: "build_retinanet_resnet_fpn_backbone"
RESNETS:
# NORM: "SyncBN"
OUT_FEATURES: ["res3", "res4", "res5"]
ANCHOR_GENERATOR:
SIZES: !!python/object/apply:eval ["[[x, x * 2**(1.0/3), x * 2**(2.0/3) ] for x in [32, 64, 128, 256, 512 ]]"]
FPN:
# NORM: "SyncBN"
IN_FEATURES: ["res3", "res4", "res5"]
RETINANET:
IOU_THRESHOLDS: [0.4, 0.5]
IOU_LABELS: [0, -1, 1]
SMOOTH_L1_LOSS_BETA: 0.0
DATASETS:
TRAIN: ("coco_2017_train_oss",)
TEST: ("coco_2017_val_oss",)
SOLVER:
IMS_PER_BATCH: 16
BASE_LR: 0.01 # Note that RetinaNet uses a different default learning rate
STEPS: (60000, 80000)
MAX_ITER: 90000
CLIP_GRADIENTS: {"ENABLED": True}
CHECKPOINT_PERIOD: 10000
# warmup
WARMUP_FACTOR: 1e-3
WARMUP_ITERS: 1000
WARMUP_METHOD: "linear"
INPUT:
MIN_SIZE_TRAIN: (800,)
VERSION: 2
TEST:
EVAL_PERIOD: 10000
OSS_PREFIX: '/data/oss_bucket_0/'
# OUTPUT_DIR: '' # specified by jobname in mdl args
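Note: the `!!python/object/apply:eval` line under ANCHOR_GENERATOR just expands to RetinaNet's standard three-scales-per-octave anchor sizes; in plain Python:

```python
# Expansion of the ANCHOR_GENERATOR.SIZES eval expression above:
# three scales per octave (2**0, 2**(1/3), 2**(2/3)) for each base size.
sizes = [[x, x * 2 ** (1.0 / 3), x * 2 ** (2.0 / 3)]
         for x in [32, 64, 128, 256, 512]]
print(sizes[0])  # [32, 40.3174..., 50.7968...]
```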