[Feature] Support RT-DETR

Open flytocc opened this issue 1 year ago • 12 comments

Motivation

Support RT-DETR as discussed in this issue

Referred to the following repositories for implementation details:

Modification

  1. Added support for RT-DETR with variants (r18vd, r34vd, r50vd, r101vd).

  2. Added support for random sizes and interpolations in BatchSyncRandomResize.

  3. Modified ResNetV1d for depth 18 and 34.

  4. Added a specialized varifocal loss, RTDETRVarifocalLoss.
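Item 2 can be pictured with a minimal sketch. This is not the PR's implementation: the function name `pick_resize_params`, the size range, and the interpolation names are all assumptions chosen for illustration. The idea shown is that one target size and one interpolation mode are drawn per batch, so every image in that batch is resized the same way.

```python
import random

# Hypothetical candidates: square sizes from 480 to 800 in steps of 32,
# and a few common interpolation mode names.
SIZES = list(range(480, 801, 32))
INTERPOLATIONS = ["nearest", "bilinear", "bicubic", "area"]


def pick_resize_params(rng, sizes=SIZES, interpolations=INTERPOLATIONS):
    """Draw one (size, interpolation) pair to apply to a whole batch."""
    size = rng.choice(sizes)
    mode = rng.choice(interpolations)
    return (size, size), mode


rng = random.Random(0)  # seeded so the example is repeatable
target_size, mode = pick_resize_params(rng)
```

In distributed training the drawn values would presumably also have to be shared across ranks so that all workers resize to the same shape; the sketch leaves that step out.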

BC-breaking

When the depth is set to 18 or 34 in ResNetV1d, a downsample with conv_bn is now added to layer1.
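A small sketch of why this changes behavior, assuming the usual ResNet rule for when a shortcut projection is created (the helper `needs_downsample` is hypothetical and written only for this illustration). Under that rule, layer1 of a BasicBlock network (depth 18 or 34) gets no downsample branch, while layer1 of a Bottleneck network (depth 50 or 101) already has one.

```python
def needs_downsample(stride, inplanes, planes, expansion):
    """Usual ResNet rule: a shortcut projection is needed only when the
    residual branch changes the spatial size or the channel count."""
    return stride != 1 or inplanes != planes * expansion


# layer1 uses stride 1 with 64 input channels and 64 base planes.
basic_block_layer1 = needs_downsample(1, 64, 64, expansion=1)  # depth 18/34
bottleneck_layer1 = needs_downsample(1, 64, 64, expansion=4)   # depth 50/101

print(basic_block_layer1)  # False
print(bottleneck_layer1)   # True
```

Since the PR now adds a conv_bn downsample to layer1 for depths 18 and 34, the module gains parameters it did not have before, so checkpoints saved with the old structure may not load cleanly.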

Checklist

  • [x] Pre-commit or other linting tools are used to fix the potential lint issues.
  • [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • [x] If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMPreTrain.
  • [x] The documentation has been modified accordingly, like docstring or example tutorials.

flytocc avatar Jan 17 '24 13:01 flytocc

Reproduction

All results were trained on 1 GPU (V100) with a total batch size of 16.

r18vd with amp

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.465
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.639
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.503
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.286
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.501
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.625
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.689
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.506
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.733
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.872

r18vd without amp

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.466
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.640
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.505
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.289
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.498
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.629
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.689
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.496
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.733
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.864

r50vd with amp

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.531
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.714
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.575
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.351
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.578
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.700
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.722
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.724
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.724
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.549
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.766
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.883

flytocc avatar Jan 17 '24 13:01 flytocc

@flytocc Thank you very much. I would like to confirm why a previous pull request (PR) could not align the precision. Was something incorrect there?

hhaAndroid avatar Jan 18 '24 03:01 hhaAndroid

There are many differences in detail, and here are the ones I think are more important (r50vd arch for example):

| flytocc/rtdetr | nijkah/rtdetr |
| --- | --- |
| norm_decay_mult=0 | default (1) |
| BatchSyncRandomResize | RandomChoiceResize |
| MinIoURandomCrop | RandomCrop |
| Init encoder with PyTorch-like uniform | Init HybridEncoder with mmcv-like normal |
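For the `norm_decay_mult=0` difference, a hedged config sketch in MMEngine style follows. The optimizer type, learning rate, and weight decay are placeholder values assumed for illustration, not taken from the PR; only `norm_decay_mult=0` comes from the comparison above. The setting scales the weight decay applied to normalization-layer parameters, so a multiplier of 0 turns it off for them.

```python
# Placeholder hyperparameters; only norm_decay_mult=0 is from the thread.
base_weight_decay = 1e-4

optim_wrapper = dict(
    type="OptimWrapper",
    optimizer=dict(type="AdamW", lr=1e-4, weight_decay=base_weight_decay),
    # Multiplier applied to weight decay for normalization layers.
    paramwise_cfg=dict(norm_decay_mult=0),
)

# Effective weight decay seen by normalization layers under this config.
norm_decay = base_weight_decay * optim_wrapper["paramwise_cfg"]["norm_decay_mult"]
```

With the default multiplier of 1, normalization layers would instead receive the full `base_weight_decay`.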

flytocc avatar Jan 18 '24 05:01 flytocc

  • The AP of the r50vd arch trained with AMP fluctuates between 52.9 and 53.1.

  • Random interpolation has almost no effect on AP.

flytocc avatar Feb 05 '24 02:02 flytocc

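Hello, I tried your RT-DETR implementation and found that it fails to converge on my dataset: at around epoch 20 the mAP suddenly drops to 0. I do not see this problem with the official RT-DETR code. Both runs used 4 GPUs and a batch size of 4.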

ychensu avatar Feb 20 '24 03:02 ychensu

So far I have only tested on the COCO dataset. You could try checking the data augmentation.

flytocc avatar Feb 20 '24 04:02 flytocc

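I kept only the resize-to-640 augmentation and it still fails: the mAP drops at around epoch 21, even after I increased the learning rate. I am using the TotalText text detection dataset, and this problem has only occurred with your code. [image]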

ychensu avatar Feb 20 '24 07:02 ychensu

Correction to my previous comment: the problem persists even when I lower the learning rate or increase the batch size.

ychensu avatar Feb 20 '24 07:02 ychensu

@ychensu How about opening an issue at flytocc/mmdetection?

flytocc avatar Feb 20 '24 08:02 flytocc

Is this currently blocked?

mmeendez8 avatar May 03 '24 13:05 mmeendez8