
Inconsistent behavior of the test script in mmdetection vs mmdeploy

Open ernestlwt opened this issue 3 years ago • 4 comments

Thanks for your bug report. We appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug

A clear and concise description of what the bug is.

Inconsistent behavior of the test script in mmdetection vs mmdeploy from the same config file

config file used

...
classes = ('person',)  # variable assigned
data = dict(
    train=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'train.json'),
    val=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'),
    test=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'))
...

The script at mmdetection/tools/test.py runs the evaluation without taking the classes variable into account. The script at mmdeploy/tools/test.py, however, filters the classes accordingly during evaluation, even though the variable has not been passed into the data.train dict. As a result, the two scripts produce inconsistent test results.
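To check whether a config is in the state described above (a top-level classes variable that is never passed to any data split), something like the following might help. This is only a sketch: splits_missing_classes is a hypothetical helper, and the config is modelled as a plain dict rather than loaded through the real config loader.

```python
def splits_missing_classes(cfg):
    """Return the data splits that do not receive the top-level `classes`.

    `cfg` is a plain dict standing in for a loaded config (an assumption
    made for illustration only).
    """
    if cfg.get('classes') is None:
        return []  # nothing defined at the top level, so nothing to compare
    data = cfg.get('data', {})
    return [name for name in ('train', 'val', 'test')
            if name in data and 'classes' not in data[name]]


# Mirrors the config above: `classes` is assigned, but the
# `classes=classes` line is commented out in every split.
cfg = dict(
    classes=('person',),
    data=dict(
        train=dict(ann_file='train.json'),
        val=dict(ann_file='val.json'),
        test=dict(ann_file='val.json')))

print(splits_missing_classes(cfg))
```

Any split reported here is one where the two test scripts may disagree about which classes are evaluated.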

Reproduction

  1. What command or script did you run?

mmdetection/tools/test.py and mmdeploy/tools/test.py

  2. Did you make any modifications on the code or config? Did you understand what you have modified?

config file modified as shown above

Environment

  1. Please run python tools/check_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)
2022-05-24 01:54:53,577 - mmdeploy - INFO - TorchVision: 0.9.0
2022-05-24 01:54:53,577 - mmdeploy - INFO - OpenCV: 4.5.5
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMCV: 1.4.0
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMCV Compiler: GCC 7.3
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMCV CUDA Compiler: 10.2
2022-05-24 01:54:53,577 - mmdeploy - INFO - MMDeploy: 0.4.0+3786856
2022-05-24 01:54:53,578 - mmdeploy - INFO - 

2022-05-24 01:54:53,578 - mmdeploy - INFO - **********Backend information**********
[2022-05-24 01:54:54.020] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
2022-05-24 01:54:54,133 - mmdeploy - INFO - onnxruntime: 1.8.1  ops_is_avaliable : True
2022-05-24 01:54:54,135 - mmdeploy - INFO - tensorrt: 8.2.4.2   ops_is_avaliable : True
2022-05-24 01:54:54,136 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
2022-05-24 01:54:54,137 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-05-24 01:54:54,138 - mmdeploy - INFO - openvino_is_avaliable: False
2022-05-24 01:54:54,138 - mmdeploy - INFO - 

2022-05-24 01:54:54,138 - mmdeploy - INFO - **********Codebase information**********
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmdet:      2.24.1
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmseg:      None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmcls:      None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmocr:      None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmedit:     None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmdet3d:    None
2022-05-24 01:54:54,140 - mmdeploy - INFO - mmpose:     None

Error traceback

If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

ernestlwt avatar May 24 '22 03:05 ernestlwt

OK, let me try to make a minimal reproduction.

tpoisonooo avatar May 24 '22 15:05 tpoisonooo

I probably know what the problem is:

  • build_object_detection_model() needs get_classes_from_config()
  • The model from model = task_processor.init_backend_model(args.model) may be incorrect.

Can you give me $MODEL_CONFIG and $MODEL_PATH? I have tried atss and everything is fine.

tpoisonooo avatar May 25 '22 15:05 tpoisonooo

This is the configuration file I am using. I am retraining from the checkpoint used in the mmdetection tutorial, faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

_base_ = [
    '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
]
# We also need to change the num_classes in head to match the dataset's annotation
# model = dict(
#     roi_head=dict(
#         bbox_head=dict(
#             type='Shared2FCBBoxHead',
#             num_classes=1,))
# )
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
image_size = (1333, 800)
train_pipeline = [
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize', # not this
        img_scale=image_size,
        ratio_range=(0.1, 2.0),
        multiscale_mode='range',
        keep_ratio=True),
    dict(
        type='RandomCrop', # not this
        crop_type='absolute_range',
        crop_size=image_size,
        recompute_bbox=True,
        allow_negative_crop=True),
    dict(type='FilterAnnotations', min_gt_bbox_wh=(1e-2, 1e-2)), # not this
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=image_size), 
    # dict(type='Pad', size=32),  
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

# Modify dataset related settings
dataset_type = 'COCODataset'
# data_root = 'data/coco_mini/'
data_root = '/root/workspace/mmdetection/data/coco_mini/'
# classes=  ('person',)
data = dict(
    train=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'train.json'),
    val=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'),
    test=dict(
        img_prefix=data_root + 'images/',
        # classes=classes,
        ann_file=data_root + 'val.json'))

# We can use the pre-trained Faster R-CNN model to obtain higher performance
load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
evaluation = dict(interval=1, metric='bbox', save_best='bbox_mAP_50', classwise=True)
runner = dict(_delete_=True, type='EpochBasedRunner', max_epochs=12)

ernestlwt avatar May 30 '22 01:05 ernestlwt

Thanks for your config file. I think I have reproduced the case.

As listed before, this snippet:

  • first attempts to find user input
  • if that fails, gets it from the model configuration
  • if that fails again, generates the sequence by len(CLASSES)
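The fallback order listed above can be sketched roughly as follows. This is not the actual mmdeploy code: resolve_class_names and its parameters are hypothetical, the model configuration is modelled as a plain dict, and the generated placeholder names in the last step are an assumption about what "generate the sequence" means.

```python
def resolve_class_names(user_classes=None, model_cfg=None, num_classes=0):
    """Hypothetical helper illustrating the three-step fallback."""
    # 1. Prefer what the user supplied explicitly.
    if user_classes is not None:
        return tuple(user_classes)
    # 2. Otherwise look in the model configuration (a plain dict here).
    if model_cfg is not None and model_cfg.get('classes') is not None:
        return tuple(model_cfg['classes'])
    # 3. Otherwise generate one placeholder name per class (assumed format).
    return tuple(str(i) for i in range(num_classes))


# User input wins even when the model configuration lists other classes.
print(resolve_class_names(('person',), {'classes': ('person', 'car')}))
# No user input: the model configuration is used.
print(resolve_class_names(None, {'classes': ('person', 'car')}))
# Neither is available: fall back to generated names.
print(resolve_class_names(None, None, num_classes=3))
```

Under this ordering, a classes value that reaches the first step is honoured regardless of what the data dicts contain, which would match the filtering seen in the report.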

In the mmdet customize dataset tutorial, the user needs to specify CLASSES.
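For the config posted in this thread, that would mean uncommenting the classes lines so every split receives them. A minimal sketch, assuming the same file layout as above (the relative data_root is the commented-out value from that config):

```python
data_root = 'data/coco_mini/'
classes = ('person',)

data = dict(
    train=dict(
        img_prefix=data_root + 'images/',
        classes=classes,  # passed explicitly to the training split
        ann_file=data_root + 'train.json'),
    val=dict(
        img_prefix=data_root + 'images/',
        classes=classes,  # passed explicitly to the validation split
        ann_file=data_root + 'val.json'),
    test=dict(
        img_prefix=data_root + 'images/',
        classes=classes,  # passed explicitly to the test split
        ann_file=data_root + 'val.json'))
```

With classes given to every split, both test scripts are told the same class list, so they should no longer depend on different sources for it.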

So I think mmdeploy meets the original needs of the model trainer by prioritizing user input, and mmdeploy should not change this behavior.

Let me open an mmdet issue to discuss this inconsistency.

tpoisonooo avatar May 30 '22 13:05 tpoisonooo

Closing this issue due to no reply. If you would like to discuss it further, please reopen it.

tpoisonooo avatar Aug 29 '22 09:08 tpoisonooo