mmdetection
[Feature] Support RT-DETR
Motivation
Support RT-DETR as discussed in this issue
Referred to the following repositories for implementation details:
- lyuwenyu/RT-DETR
- PaddlePaddle/PaddleYOLO
- PaddlePaddle/PaddleDetection
- nijkah/mmdetection in PR #10498
Modification
- Added support for RT-DETR with variants (r18vd, r34vd, r50vd, r101vd).
- Added support for random sizes and interpolations in `BatchSyncRandomResize`.
- Modified `ResNetV1d` for depth 18 and 34.
- Added a specialized varifocal loss, `RTDETRVarifocalLoss`.
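For readers unfamiliar with varifocal loss, here is a minimal per-element sketch of the generic formulation (binary cross-entropy against an IoU-aware soft target, weighted by the target for positives and by a focal term for negatives). It is not the PR's `RTDETRVarifocalLoss`, whose specialization is not described in this thread. The function name and the defaults `alpha=0.75`, `gamma=2.0` are assumptions for illustration.

```python
import math


def varifocal_loss(logit, target, alpha=0.75, gamma=2.0):
    """Generic varifocal loss for a single (logit, target) pair.

    `target` is the IoU-aware classification score: > 0 for a positive
    sample, 0 for a negative one. `alpha` and `gamma` are illustrative
    defaults, not values taken from this PR.
    """
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid probability
    # Binary cross-entropy against the soft target.
    bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    if target > 0:
        weight = target  # positives are weighted by their IoU-aware target
    else:
        weight = alpha * abs(p - target) ** gamma  # negatives are down-weighted
    return weight * bce
```

With the focal term, a confident and correct negative prediction contributes very little loss, while positives with high IoU targets contribute the most.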
BC-breaking
When the depth is set to 18 or 34 in ResNetV1d, a downsample with conv_bn is now added to layer1.
Checklist
- [x] Pre-commit or other linting tools are used to fix the potential lint issues.
- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
- [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMPreTrain.
- [x] The documentation has been modified accordingly, like docstring or example tutorials.
Reproduction
All results were trained on 1 GPU (V100) with a total batch size of 16.
r18vd with amp
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.465
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.639
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.503
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.286
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.501
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.625
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.692
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.692
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.506
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.733
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.872
r18vd without amp
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.466
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.640
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.505
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.289
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.498
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.629
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.692
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.692
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.496
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.733
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.864
r50vd with amp
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.531
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.714
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.575
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.351
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.578
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.700
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.722
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.724
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.724
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.549
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.766
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.883
@flytocc Thank you very much. I would like to confirm why a previous pull request (PR) could not align the precision. Was something incorrect there?
There are many differences in detail, and here are the ones I think are more important (r50vd arch for example):
| flytocc/rtdetr | nijkah/rtdetr |
|---|---|
| norm_decay_mult=0 | default 1 |
| BatchSyncRandomResize | RandomChoiceResize |
| MinIoURandomCrop | RandomCrop |
| Init encoder with PyTorch-like uniform | Init HybridEncoder with mmcv-like normal |
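To illustrate the first row, the fragment below sketches how `norm_decay_mult=0` is typically set through `paramwise_cfg` in an MMEngine-style optimizer config. The optimizer type, learning rate, and weight decay are placeholder assumptions, not values taken from this PR.

```python
# Hypothetical config fragment; optimizer settings are placeholders.
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='AdamW', lr=1e-4, weight_decay=1e-4),
    # norm_decay_mult scales the weight decay applied to normalization
    # layers; 0 disables it, while the default of 1 leaves it unchanged.
    paramwise_cfg=dict(norm_decay_mult=0),
)

# Effective weight decay on normalization-layer parameters.
norm_weight_decay = (
    optim_wrapper['optimizer']['weight_decay']
    * optim_wrapper['paramwise_cfg']['norm_decay_mult']
)
```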
- The training (w. amp) AP of the `r50vd` arch fluctuates between `52.9` and `53.1`.
- Random interpolations have almost no effect on AP.
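As a rough picture of what "random sizes and interpolations" means, the sketch below draws one square target size and one interpolation mode per batch. It is a standalone illustration, not the PR's `BatchSyncRandomResize` code: the candidate sizes, mode names, and the helper `sample_batch_resize` are all hypothetical.

```python
import random

# Hypothetical candidates; the PR's actual values are not shown in this thread.
SIZES = [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800]
INTERPOLATIONS = ['nearest', 'bilinear', 'bicubic', 'area']


def sample_batch_resize(rng):
    """Pick one (height, width) and one interpolation mode for a whole batch."""
    size = rng.choice(SIZES)
    mode = rng.choice(INTERPOLATIONS)
    return (size, size), mode


rng = random.Random(0)  # fixed seed so the draw is reproducible
shape, mode = sample_batch_resize(rng)
```

In a multi-GPU setting the sampled values would additionally need to be shared across ranks so that every device resizes to the same shape, which is what the "Sync" in the class name suggests.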
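Hello, I tried out your RT-DETR implementation and found that training fails to converge on my dataset: at around epoch 20, the mAP suddenly drops to 0. I don't have this problem with the official RT-DETR code. Both runs used 4 GPUs and a batch size of 4.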
So far it has only been tested on the COCO dataset. You could try checking your data augmentation.
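I kept only the resize-to-640 augmentation, but it still doesn't work: the mAP drops at around epoch 21, and increasing the learning rate doesn't help either. I'm using the totaltext text detection dataset, and this problem has only appeared with your code.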
Typo: I meant that lowering the learning rate or increasing the batch size still doesn't fix the problem.
@ychensu How about opening an issue at flytocc/mmdetection?
Is this currently blocked?
