mmrotate HRSC2016 dataset performance reimplementation

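Hello! I used your framework, mmrotate 0.2.0, to train and test on the HRSC2016 dataset with several models, including ReDet and RoI Transformer. The test-set performance differs greatly from what I get with the code released by the ReDet authors, especially AP75 and mAP, even though the configurations are the same and basically unmodified. I would like to ask for your help: have you run verification experiments on HRSC? My config is below.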

dataset_type = 'HRSCDataset'
data_root = '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(800, 512)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version='le90'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(800, 512),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='HRSCDataset',
        classwise=False,
        ann_file=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/hrsc/train/trainset.txt',
        ann_subdir=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/FullDataSet/Annotations/',
        img_subdir=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/FullDataSet/AllImages/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RResize', img_scale=(800, 512)),
            dict(
                type='RRandomFlip',
                flip_ratio=[0.25, 0.25, 0.25],
                direction=['horizontal', 'vertical', 'diagonal'],
                version='le90'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        version='le90'),
    val=dict(
        type='HRSCDataset',
        classwise=False,
        ann_file=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/hrsc/test/testset.txt',
        ann_subdir=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/FullDataSet/Annotations/',
        img_subdir=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/FullDataSet/AllImages/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(800, 512),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'),
    test=dict(
        type='HRSCDataset',
        classwise=False,
        ann_file=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/hrsc/test/testset.txt',
        ann_subdir=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/FullDataSet/Annotations/',
        img_subdir=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/data/datasets/HRSC2016/HRSC2016/FullDataSet/AllImages/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(800, 512),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'))
evaluation = dict(interval=36, metric='mAP')
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[24, 33])
runner = dict(type='EpochBasedRunner', max_epochs=36)
checkpoint_config = dict(interval=12)
log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook'),
           dict(type='TensorboardLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
angle_version = 'le90'
model = dict(
    type='ReDet',
    backbone=dict(
        type='ReResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch',
        pretrained=
        '/home/zhangc/Zhangc/oriented_det/mmrotate-020/work_dirs/ReResNet_pretrain/re_resnet50_c8_batch256-25b16846.pth'
    ),
    neck=dict(
        type='ReFPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RotatedRPNHead',
        in_channels=256,
        feat_channels=256,
        version='le90',
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='RoITransRoIHead',
        version='le90',
        num_stages=2,
        stage_loss_weights=[1, 1],
        bbox_roi_extractor=[
            dict(
                type='SingleRoIExtractor',
                roi_layer=dict(
                    type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            dict(
                type='RotatedSingleRoIExtractor',
                roi_layer=dict(
                    type='RiRoIAlignRotated',
                    out_size=7,
                    num_samples=2,
                    num_orientations=8,
                    clockwise=True),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32])
        ],
        bbox_head=[
            dict(
                type='RotatedShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=1,
                bbox_coder=dict(
                    type='DeltaXYWHAHBBoxCoder',
                    angle_range='le90',
                    norm_factor=2,
                    edge_swap=True,
                    target_means=[0.0, 0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.1, 0.1, 0.2, 0.2, 1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='RotatedShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=1,
                bbox_coder=dict(
                    type='DeltaXYWHAOBBoxCoder',
                    angle_range='le90',
                    norm_factor=None,
                    edge_swap=True,
                    proj_xy=True,
                    target_means=[0.0, 0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.05, 0.05, 0.1, 0.1, 0.5]),
                reg_class_agnostic=False,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ]),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=[
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1,
                    iou_calculator=dict(type='BboxOverlaps2D')),
                sampler=dict(
                    type='RandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1,
                    iou_calculator=dict(type='RBboxOverlaps2D')),
                sampler=dict(
                    type='RRandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)
        ]),
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            nms_pre=2000,
            min_bbox_size=0,
            score_thr=0.05,
            nms=dict(iou_thr=0.1),
            max_per_img=2000)))
work_dir = 'work_dirs/redet/redet_re50_refpn_3x_hrsc_le90'
auto_resume = False
gpu_ids = range(0, 1)

Apr 11 '22 02:04 zhangcongzc

Please use English or English & Chinese for issues so that we could have broader discussion.

Apr 11 '22 02:04 mm-assistant[bot]

Have you ever used the official ReDet to do related experiments?

Apr 11 '22 02:04 yangxue0827

Were more data augmentation, multi-scale training or testing, etc. used in the paper?

Apr 11 '22 02:04 yangxue0827

It is recommended to use the official code to run an experiment with the same configuration as mmrotate. I speculate that the related settings of HRSC2016 in the original paper are not fully described.

Apr 11 '22 02:04 yangxue0827

Regarding whether more data augmentation, multi-scale, etc. are used in the paper: thank you for your reply! I have run the related experiments with the official ReDet code, and it basically reproduces the performance reported in the paper. I did not use tricks such as multi-scale or rotation augmentation to improve ReDet's performance.

Apr 11 '22 02:04 zhangcongzc

Here is one difference:

official redet: lr=0.01
mmrotate redet: lr=0.0025

Apr 11 '22 03:04 yangxue0827

How many graphics cards did you use to run the official redet?

Apr 11 '22 03:04 yangxue0827

The target_stds also differ between them:

official redet:
target_stds=[0.1, 0.1, 0.2, 0.2, 0.1]
target_stds=[0.05, 0.05, 0.1, 0.1, 0.05]

mmrotate redet (lr=0.0025):
target_stds=[0.1, 0.1, 0.2, 0.2, 1]
target_stds=[0.05, 0.05, 0.1, 0.1, 0.5]
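To see why the last entry of target_stds could matter, here is a minimal sketch. It assumes the usual delta-coder normalization, `(delta - mean) / std`, and leaves out angle-specific steps such as `norm_factor`. The helper name `normalize_delta` and the raw angle offset of 0.05 are illustrative only and do not come from either codebase.

```python
def normalize_delta(delta, mean, std):
    """Normalize a raw regression delta the way delta-based box coders usually do."""
    return (delta - mean) / std


# Hypothetical raw angle offset between a proposal and its ground-truth box.
raw_angle_delta = 0.05

# Angle std (last entry of target_stds) for the two RoI stages, as quoted in this thread.
angle_stds = {
    "official redet, stage 1": 0.1,
    "mmrotate redet, stage 1": 1.0,
    "official redet, stage 2": 0.05,
    "mmrotate redet, stage 2": 0.5,
}

for name, std in angle_stds.items():
    target = normalize_delta(raw_angle_delta, 0.0, std)
    print(f"{name}: angle regression target = {target:.3f}")
```

Under that assumption, the official settings make the angle target ten times larger in both stages, so angle errors contribute more to the regression loss. This is only one plausible explanation for a gap at high IoU thresholds.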

Please compare it carefully, and I will also do relevant experiments to follow up.

Apr 11 '22 03:04 yangxue0827

There is also a difference in dealing with difficult cases. ReDet will ignore difficult cases in evaluation.
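As a rough illustration of what ignoring difficult cases can mean, the sketch below follows the common VOC-style convention: a detection matched to a difficult ground truth counts as neither a true positive nor a false positive, and difficult ground truths are excluded from the recall denominator. The function `count_tp_fp` and its inputs are hypothetical, and this is not the actual ReDet or mmrotate evaluation code.

```python
def count_tp_fp(matches, difficult, ignore_difficult):
    """Count TP/FP for detections already sorted by descending score.

    matches: for each detection, the index of its matched ground truth, or None.
    difficult: for each ground truth, True if it is flagged as difficult.
    """
    tp = fp = 0
    claimed = set()  # ground truths already matched by a higher-scoring detection
    for gt_idx in matches:
        if gt_idx is None:
            fp += 1  # no ground truth overlaps enough
        elif ignore_difficult and difficult[gt_idx]:
            continue  # skipped: neither TP nor FP
        elif gt_idx in claimed:
            fp += 1  # duplicate detection of the same ground truth
        else:
            claimed.add(gt_idx)
            tp += 1
    num_gt = sum(1 for d in difficult if not (ignore_difficult and d))
    return tp, fp, num_gt


matches = [0, 1, None, 1]      # four detections
difficult = [False, True]      # the second ground truth is difficult
print(count_tp_fp(matches, difficult, ignore_difficult=True))
print(count_tp_fp(matches, difficult, ignore_difficult=False))
```

In this toy case, the duplicate detection of the difficult object is a false positive only when difficult cases are kept, which is how the two conventions can yield different AP on the same detections.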

Apr 11 '22 04:04 jbwang1997

This is my config:

_base_ = [
    '../_base_/datasets/hrsc.py', '../_base_/schedules/schedule_3x.py',
    '../_base_/default_runtime.py'
]

angle_version = 'le90'
model = dict(
    type='ReDet',
    backbone=dict(
        type='ReResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch',
        pretrained='./work_dirs/re_resnet50_c8_batch256-25b16846.pth'),
    neck=dict(
        type='ReFPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RotatedRPNHead',
        in_channels=256,
        feat_channels=256,
        version=angle_version,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
    roi_head=dict(
        type='RoITransRoIHead',
        version=angle_version,
        num_stages=2,
        stage_loss_weights=[1, 1],
        bbox_roi_extractor=[
            dict(
                type='SingleRoIExtractor',
                roi_layer=dict(
                    type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            dict(
                type='RotatedSingleRoIExtractor',
                roi_layer=dict(
                    type='RiRoIAlignRotated',
                    out_size=7,
                    num_samples=2,
                    num_orientations=8,
                    clockwise=True),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
        ],
        bbox_head=[
            dict(
                type='RotatedShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=1,
                bbox_coder=dict(
                    type='DeltaXYWHAHBBoxCoder',
                    angle_range=angle_version,
                    norm_factor=2,
                    edge_swap=True,
                    target_means=[0., 0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='RotatedShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=1,
                bbox_coder=dict(
                    type='DeltaXYWHAOBBoxCoder',
                    angle_range=angle_version,
                    norm_factor=None,
                    edge_swap=True,
                    proj_xy=True,
                    target_means=[0., 0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1, 0.05]),
                reg_class_agnostic=False,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ]),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=[
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1,
                    iou_calculator=dict(type='BboxOverlaps2D')),
                sampler=dict(
                    type='RandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1,
                    iou_calculator=dict(type='RBboxOverlaps2D')),
                sampler=dict(
                    type='RRandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)
        ]),
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            nms_pre=2000,
            min_bbox_size=0,
            score_thr=0.05,
            nms=dict(iou_thr=0.1),
            max_per_img=2000)))

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(800, 512)),
    dict(type='RRandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(800, 512),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]

dataset_type = 'HRSCDataset'
data_root = '/data/dataset_share/HRSC2016/HRSC2016/'
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        classwise=False,
        ann_file=data_root + 'ImageSets/trainval.txt',
        ann_subdir=data_root + 'FullDataSet/Annotations/',
        img_subdir=data_root + 'FullDataSet/AllImages/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        classwise=False,
        ann_file=data_root + 'ImageSets/test.txt',
        ann_subdir=data_root + 'FullDataSet/Annotations/',
        img_subdir=data_root + 'FullDataSet/AllImages/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        classwise=False,
        ann_file=data_root + 'ImageSets/test.txt',
        ann_subdir=data_root + 'FullDataSet/Annotations/',
        img_subdir=data_root + 'FullDataSet/AllImages/',
        pipeline=test_pipeline))

evaluation = dict(interval=12, metric='mAP')
optimizer = dict(lr=0.01)

Performance at epoch 12:

---------------iou_thr: 0.5---------------
2022-04-11 12:05:50,553 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.967  | 0.891 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.891 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.55---------------
2022-04-11 12:05:59,825 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.961  | 0.890 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.890 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.6---------------
2022-04-11 12:06:09,510 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.954  | 0.890 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.890 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.65---------------
2022-04-11 12:06:18,999 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.937  | 0.887 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.887 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.7---------------
2022-04-11 12:06:28,428 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.916  | 0.880 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.880 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.75---------------
2022-04-11 12:06:37,823 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.880  | 0.798 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.798 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.8---------------
2022-04-11 12:06:47,119 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.807  | 0.757 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.757 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.85---------------
2022-04-11 12:06:56,765 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.625  | 0.528 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.528 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.9---------------
2022-04-11 12:07:05,898 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.289  | 0.178 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.178 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.95---------------
2022-04-11 12:07:15,514 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1764 | 0.027  | 0.006 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.006 |
+-------+------+------+--------+-------+

Apr 11 '22 04:04 yangxue0827

Performance at epoch 24:

---------------iou_thr: 0.5---------------
2022-04-11 12:38:49,661 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.970  | 0.904 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.904 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.55---------------
2022-04-11 12:38:58,824 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.968  | 0.904 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.904 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.6---------------
2022-04-11 12:39:08,249 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.964  | 0.903 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.903 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.65---------------
2022-04-11 12:39:17,661 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.953  | 0.901 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.901 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.7---------------
2022-04-11 12:39:27,196 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.943  | 0.899 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.899 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.75---------------
2022-04-11 12:39:36,362 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.907  | 0.883 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.883 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.8---------------
2022-04-11 12:39:45,558 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.848  | 0.784 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.784 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.85---------------
2022-04-11 12:39:55,024 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.717  | 0.642 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.642 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.9---------------
2022-04-11 12:40:04,639 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.389  | 0.244 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.244 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.95---------------
2022-04-11 12:40:14,024 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1346 | 0.056  | 0.030 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.030 |
+-------+------+------+--------+-------+

Apr 11 '22 04:04 yangxue0827

Did you use four graphics cards to run mmrotate? If so, setting the learning rate to 0.01 is correct. I only used one graphics card, so I set the learning rate to 0.0025 in both codebases (total batch size 8 <=> lr 0.01, 2 <=> lr 0.0025).
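The pairing above is the linear scaling heuristic: the learning rate scales in proportion to the total batch size. A minimal sketch follows, assuming the reference point of lr 0.01 at total batch size 8 mentioned above; the function name `scaled_lr` is made up for illustration.

```python
def scaled_lr(num_gpus, samples_per_gpu, ref_lr=0.01, ref_batch_size=8):
    """Scale the learning rate linearly with the total batch size."""
    total_batch_size = num_gpus * samples_per_gpu
    return ref_lr * total_batch_size / ref_batch_size


# One GPU with samples_per_gpu=2, as in the config at the top of this issue.
print(f"1 GPU x 2 samples:  lr = {scaled_lr(1, 2):.4f}")
# Four GPUs with samples_per_gpu=2.
print(f"4 GPUs x 2 samples: lr = {scaled_lr(4, 2):.4f}")
```

This is only a heuristic for picking a starting value; it does not guarantee matching accuracy across batch sizes.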

I had noticed the differences in target means and stds, but I didn't think they could affect performance this much. Your performance at epochs 12 and 24 is already better than mine. I will run the same experiments to verify it. Thanks for your help!

Apr 11 '22 04:04 zhangcongzc

I am using a single GPU, since four free GPUs are not available. I think the main reason is the lr.

Apr 11 '22 05:04 yangxue0827

Performance at epoch 36:

AP50 = 90.4, AP75 = 89.5, mAP = 72.3

---------------iou_thr: 0.5---------------
2022-04-11 13:11:30,212 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.977  | 0.904 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.904 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.55---------------
2022-04-11 13:11:39,697 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.976  | 0.904 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.904 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.6---------------
2022-04-11 13:11:49,044 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.971  | 0.903 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.903 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.65---------------
2022-04-11 13:11:58,460 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.959  | 0.901 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.901 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.7---------------
2022-04-11 13:12:07,693 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.954  | 0.901 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.901 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.75---------------
2022-04-11 13:12:17,115 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.928  | 0.895 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.895 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.8---------------
2022-04-11 13:12:26,512 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.870  | 0.795 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.795 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.85---------------
2022-04-11 13:12:35,612 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.757  | 0.677 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.677 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.9---------------
2022-04-11 13:12:45,282 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.485  | 0.328 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.328 |
+-------+------+------+--------+-------+

---------------iou_thr: 0.95---------------
2022-04-11 13:12:54,782 - mmrotate - INFO -
+-------+------+------+--------+-------+
| class | gts  | dets | recall | ap    |
+-------+------+------+--------+-------+
| ship  | 1228 | 1350 | 0.089  | 0.023 |
+-------+------+------+--------+-------+
| mAP   |      |      |        | 0.023 |
+-------+------+------+--------+-------+

Apr 11 '22 05:04 yangxue0827

Here is the 12-epoch performance with:

1GPU bs=2 lr=0.0025
target_stds=[0.1, 0.1, 0.2, 0.2, 1] target_stds=[0.05, 0.05, 0.1, 0.1, 0.5]

mAP: 0.4113, AP50: 0.8570, AP55: 0.7650, AP60: 0.7350, AP65: 0.6320, AP70: 0.5030, AP75: 0.3270, AP80: 0.2100, AP85: 0.0660, AP90: 0.0180, AP95: 0.0000

The AP is much lower than the results above.
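
As a sanity check on the numbers quoted in this comment, the reported mAP appears to be the plain mean of the ten per-threshold APs (IoU 0.50 to 0.95 in steps of 0.05). A minimal sketch using only those quoted values; the variable names are illustrative and not part of mmrotate:

```python
# Per-IoU-threshold APs copied from the comment above (AP50 ... AP95).
ap_per_iou = {
    0.50: 0.8570,
    0.55: 0.7650,
    0.60: 0.7350,
    0.65: 0.6320,
    0.70: 0.5030,
    0.75: 0.3270,
    0.80: 0.2100,
    0.85: 0.0660,
    0.90: 0.0180,
    0.95: 0.0000,
}

# Average over all thresholds to get the COCO-style summary number.
mean_ap = sum(ap_per_iou.values()) / len(ap_per_iou)
print(f"mAP over {len(ap_per_iou)} IoU thresholds: {mean_ap:.4f}")
```

The result agrees with the reported mAP of 0.4113, which shows how strongly the collapse at high IoU thresholds (AP80 and above) drags the average down even though AP50 is still reasonable.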

Apr 11 '22 05:04 liuyanyi

PR https://github.com/open-mmlab/mmrotate/pull/203 is in progress. @zhangcongzc

Apr 11 '22 05:04 yangxue0827

The performance of RetinaNet and KLD has also been updated at https://github.com/open-mmlab/mmrotate/blob/e79c15dea21424cd5216ecb9244f17534d13d971/configs/kld/README.md

| Backbone | mAP | AP50 | AP75 | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs |
|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50 (800,512) | 52.06 | 84.80 | 58.10 | le90 | 6x | 1.56 | 38.2 | RR | 2 | rotated_retinanet_obb_r50_fpn_6x_hrsc_rr_le90 |
| ResNet50 (800,512) | 54.15 | 86.20 | 60.60 | le90 | 6x | 1.56 | 38.2 | RR | 2 | rotated_retinanet_obb_kld_stable_r50_fpn_6x_hrsc_rr_le90 |
| ResNet50 (800,512) | 45.09 | 79.30 | 46.90 | oc | 6x | 1.56 | 39.2 | RR | 2 | rotated_retinanet_hbb_r50_fpn_6x_hrsc_rr_oc |
| ResNet50 (800,512) | 58.19 | 86.20 | 69.80 | oc | 6x | 1.56 | 39.5 | RR | 2 | rotated_retinanet_hbb_kld_stable_r50_fpn_6x_hrsc_rr_oc |

Refer to https://github.com/csuhan/s2anet/blob/master/configs/hrsc2016/retinanet_obb_r50_fpn_6x_hrsc2016.py

Apr 12 '22 13:04 yangxue0827

Thanks for running the related experiments! My experiments also show that the performance degradation of the two models is caused by target_stds.
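
To illustrate why target_stds matters: a delta-based box coder divides the raw regression deltas by these values, so a different setting rescales the targets the head has to learn. The following is only a simplified sketch. It ignores the rotation of the center offset into the proposal's frame and the angle normalization that a real rotated-box coder performs. The `encode` helper and the example boxes are hypothetical; the two target_stds lists are the ones quoted earlier in this thread.

```python
import math


def encode(proposal, gt, stds, means=(0.0, 0.0, 0.0, 0.0, 0.0)):
    """Simplified (x, y, w, h, angle) delta encoding with mean/std normalization."""
    px, py, pw, ph, pa = proposal
    gx, gy, gw, gh, ga = gt
    raw = (
        (gx - px) / pw,     # center shift, relative to proposal width
        (gy - py) / ph,     # center shift, relative to proposal height
        math.log(gw / pw),  # log-scale width change
        math.log(gh / ph),  # log-scale height change
        ga - pa,            # angle difference in radians
    )
    # Normalization step: subtract the mean, then divide by the std.
    return [(d - m) / s for d, m, s in zip(raw, means, stds)]


# Hypothetical proposal and ground-truth boxes: (cx, cy, w, h, angle).
proposal = (100.0, 100.0, 80.0, 20.0, 0.0)
gt = (104.0, 101.0, 88.0, 22.0, 0.1)

targets_a = encode(proposal, gt, stds=[0.1, 0.1, 0.2, 0.2, 1.0])
targets_b = encode(proposal, gt, stds=[0.05, 0.05, 0.1, 0.1, 0.5])

print("stds a:", [round(t, 3) for t in targets_a])
print("stds b:", [round(t, 3) for t in targets_b])
```

Halving every std doubles every target for the same pair of boxes, so the regression loss sees a very different signal scale even though nothing else in the config changed.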

Apr 17 '22 05:04 zhangcongzc

This problem may also occur on other datasets such as SSDD. The bug affects Faster R-CNN, ReDet, and RoI Transformer. Once it is fixed, we may need to retrain the DOTA-related experiments to update the weights and logs, which is a lot of work.

Apr 17 '22 06:04 yangxue0827

Besides, I have some new questions about KLD that I need your help with. Did you use RetinaNet-hbb or RetinaNet-obb in your paper? I can't reproduce the RetinaNet + KLD performance reported in either the paper or mmrotate. I used the new config you provided, except for "_delete_", which causes code errors.

Apr 17 '22 06:04 zhangcongzc

evaluation = dict(interval=36, metric='mAP')
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[48, 66])
runner = dict(type='EpochBasedRunner', max_epochs=72)
checkpoint_config = dict(interval=36)
log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook'),
           dict(type='TensorboardLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
angle_version = 'le90'
model = dict(
    type='RotatedRetinaNet',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        zero_init_residual=False,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_input',
        num_outs=5),
    bbox_head=dict(
        type='RotatedRetinaHead',
        num_classes=1,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        assign_by_circumhbbox='le90',
        anchor_generator=dict(
            type='RotatedAnchorGenerator',
            octave_base_scale=4,
            scales_per_octave=3,
            ratios=[1.0, 0.5, 2.0],
            strides=[8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHAOBBoxCoder',
            angle_range='le90',
            norm_factor=None,
            edge_swap=True,
            proj_xy=True,
            target_means=(0.0, 0.0, 0.0, 0.0, 0.0),
            target_stds=(1.0, 1.0, 1.0, 1.0, 1.0)),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        reg_decoded_bbox=True,
        loss_bbox=dict(
            type='GDLoss',
            loss_type='kld',
            fun='log1p',
            tau=1,
            sqrt=False,
            loss_weight=1.0)),
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.4,
            min_pos_iou=0,
            ignore_iof_thr=-1,
            iou_calculator=dict(type='RBboxOverlaps2D')),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(iou_thr=0.1),
        max_per_img=2000))
work_dir = 'work_dirs/rotated_retinanet/rotated_retinanet_hbb_r50_fpn_6x_hrsc_le90_kld'
auto_resume = False
gpu_ids = range(0, 1)

Apr 17 '22 06:04 zhangcongzc

The logs and weights have been updated; you can check them.

HRSC

| Backbone | mAP | AP50 | AP75 | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50 (800,512) | 52.06 | 84.80 | 58.10 | le90 | 6x | 1.56 | 38.2 | RR | 2 | rotated_retinanet_obb_r50_fpn_6x_hrsc_rr_le90 | model \| log |
| ResNet50 (800,512) | 54.15 | 86.20 | 60.60 | le90 | 6x | 1.56 | 38.2 | RR | 2 | rotated_retinanet_obb_kld_stable_r50_fpn_6x_hrsc_rr_le90 | model \| log |
| ResNet50 (800,512) | 45.09 | 79.30 | 46.90 | oc | 6x | 1.56 | 39.2 | RR | 2 | rotated_retinanet_hbb_r50_fpn_6x_hrsc_rr_oc | model \| log |
| ResNet50 (800,512) | 58.17 | 87.00 | 69.30 | oc | 6x | 1.56 | 39.5 | RR | 2 | rotated_retinanet_hbb_kld_stable_r50_fpn_6x_hrsc_rr_oc | model \| log |

Apr 17 '22 06:04 yangxue0827

Your config is not the same as mine. RetinaNet-hbb-oc: assign_by_circumhbbox='oc'

model = dict(
    bbox_head=dict(
        reg_decoded_bbox=True,
        loss_bbox=dict(
            _delete_=True,
            type='GDLoss',
            loss_type='kld',
            fun='log1p',
            tau=1,
            sqrt=False,
            loss_weight=5.5)))

RetinaNet-hbb-le90: assign_by_circumhbbox=None

model = dict(
    bbox_head=dict(
        reg_decoded_bbox=True,
        loss_bbox=dict(
            _delete_=True,
            type='GDLoss',
            loss_type='kld',
            fun='log1p',
            tau=1,
            sqrt=False,
            loss_weight=1.0)))

I suggest copying my configs for your experiments. You need to copy rotated_retinanet/rotated_retinanet_obb_r50_fpn_6x_hrsc_rr_le90.py first; then the error caused by _delete_ will disappear.
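
A likely reason the base file matters: `_delete_` is only consumed while a child config is merged into its base. If the snippet is used without a base to inherit from, the key stays in the dict and ends up being passed to the loss constructor. The sketch below imitates that merge behavior in plain Python. The `merge` function is a simplified stand-in and not mmcv's actual implementation, and the base `loss_bbox` shown here is a hypothetical placeholder.

```python
import copy

DELETE_KEY = "_delete_"


def merge(child, base):
    """Recursively merge `child` into a copy of `base` (simplified stand-in)."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # shallow copy so the caller's dict is untouched
            replace = value.pop(DELETE_KEY, False)
            if key in merged and isinstance(merged[key], dict) and not replace:
                # Normal case: merge key by key into the inherited dict.
                merged[key] = merge(value, merged[key])
            else:
                # `_delete_=True` (or a new key): discard the inherited dict.
                merged[key] = copy.deepcopy(value)
        else:
            merged[key] = value
    return merged


# Hypothetical base config fragment (placeholder values).
base = {"bbox_head": {"loss_bbox": {"type": "L1Loss", "loss_weight": 1.0}}}

# Child fragment in the style of the override quoted above.
child = {
    "bbox_head": {
        "reg_decoded_bbox": True,
        "loss_bbox": {
            "_delete_": True,
            "type": "GDLoss",
            "loss_type": "kld",
            "loss_weight": 5.5,
        },
    }
}

merged = merge(child, base)
print(merged["bbox_head"]["loss_bbox"])

# Without a base there is no merge step, so the marker is still present.
print(DELETE_KEY in child["bbox_head"]["loss_bbox"])
```

After merging, the loss dict contains only the child's keys and no `_delete_` marker, whereas the standalone child fragment still carries it.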

Apr 17 '22 06:04 yangxue0827

Sorry, I can't open the config links. Also, I don't understand why RetinaNet-hbb-le90 sets "assign_by_circumhbbox" to "None". Shouldn't this parameter be set to angle_version for hbb?

Apr 17 '22 07:04 zhangcongzc

Refer to https://github.com/open-mmlab/mmrotate/pull/183

Apr 17 '22 07:04 yangxue0827

rotated_retinanet_obb_kld_stable_r50_fpn_6x_hrsc_rr_le90.txt rotated_retinanet_hbb_kld_stable_r50_fpn_6x_hrsc_rr_oc.txt

Apr 17 '22 07:04 yangxue0827

Sorry, maybe I didn't make myself clear. I ran experiments with both RetinaNet obb and hbb under le90, so I set "assign_by_circumhbbox" to "le90" in the second experiment (RetinaNet-hbb-le90). I thought this parameter should be set to angle_version for hbb. Is that a mistake? I don't understand why you set it to "None".

Apr 17 '22 07:04 zhangcongzc

This is our base setting, chosen after extensive experimentation. There is no particular reason.

Apr 17 '22 07:04 yangxue0827

We use hbb under oc and obb under le.

Apr 17 '22 07:04 yangxue0827

Ok, I get it. Thanks for your help.

Apr 17 '22 08:04 zhangcongzc

Besides, I have some new questions about KLD that I need your help with. Did you use RetinaNet-hbb or RetinaNet-obb in your paper? I can't reproduce the RetinaNet + KLD performance reported in either the paper or mmrotate. I used the new config you provided, except for "_delete_", which causes code errors. Could you kindly provide the config of your GWD reimplementation or other HRSC results?