mmdeploy [Bug] SDK segmentation results are intermittently discontinuous: the same image sometimes gives a normal mask and sometimes an abnormal one.

Open GSusan opened this issue 10 months ago • 0 comments

Checklist

[X] 1. I have searched related issues but cannot get the expected help.
[ ] 2. I have read the FAQ documentation but cannot get the expected help.
[ ] 3. The bug has not been fixed in the latest version.

Describe the bug

I trained a UPerNet model with mmsegmentation, then used mmdeploy to export it to ONNX. When I test it with image_segmentation.cpp, the output is inconsistent: for the same image, the result is sometimes normal, but sometimes the mask contains discontinuous pixels. I have tried recompiling mmdeploy, but that did not help. Has anyone encountered this problem? Thanks!

[Image: 1713492119943] [Image: 1713492152309]

The two images above show the two kinds of result I described.
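To put a number on "discontinuous pixels", a small helper along these lines might be useful. This is only a sketch: `count_isolated_pixels` is a hypothetical name, it assumes the exported mask has already been loaded as a 2-D NumPy array of class indices, and it uses a simple 4-neighbour definition of an isolated pixel that may not match what is visible in the screenshots.

```python
import numpy as np


def count_isolated_pixels(mask):
    """Count pixels whose label differs from all four direct neighbours."""
    mask = np.asarray(mask)
    # Pad by repeating the border, so a border pixel is compared with a copy
    # of itself on the outside and is therefore never counted as isolated.
    padded = np.pad(mask, 1, mode="edge")
    center = padded[1:-1, 1:-1]
    differs_up = center != padded[:-2, 1:-1]
    differs_down = center != padded[2:, 1:-1]
    differs_left = center != padded[1:-1, :-2]
    differs_right = center != padded[1:-1, 2:]
    isolated = differs_up & differs_down & differs_left & differs_right
    return int(isolated.sum())


# Toy example: a clean mask and the same mask with one speckle pixel.
clean = np.zeros((5, 5), dtype=np.uint8)
speckled = clean.copy()
speckled[2, 2] = 1
print(count_isolated_pixels(clean), count_isolated_pixels(speckled))
```

A normal run and an abnormal run of the same image would be expected to give clearly different counts if the artefact really is speckle-like.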

Reproduction

After deployment, I run the image_segmentation.cpp example and export the mask from the result.
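To confirm that two runs on the same image really differ, the exported masks can be compared pixel by pixel. The following is a minimal sketch, assuming both masks have been loaded into NumPy arrays of the same shape (for example from the image files written by the example program); `mask_difference` is a hypothetical helper, not part of mmdeploy.

```python
import numpy as np


def mask_difference(mask_a, mask_b):
    """Return (number of differing pixels, fraction of differing pixels)."""
    a = np.asarray(mask_a)
    b = np.asarray(mask_b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    diff = a != b
    return int(diff.sum()), float(diff.mean())


# Toy example: two 4x4 masks that differ in exactly one pixel.
a = np.zeros((4, 4), dtype=np.uint8)
b = a.copy()
b[1, 2] = 1
print(mask_difference(a, b))  # (1, 0.0625)
```

If the deployment were deterministic, repeated runs on the same image should give a difference of zero.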

Environment

(1) mmsegmentation=0.30.0, mmdeploy=0.13.0, onnxruntime=1.8.1
With the same mmdeploy and onnxruntime, I have already tested mmdet models, and all results are normal.
(2) The model_cfg is as follows:
norm_cfg = dict(type='SyncBN', requires_grad=True)
backbone_norm_cfg = dict(type='LN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='SwinTransformer',
        pretrain_img_size=224,
        embed_dims=96,
        patch_size=4,
        window_size=7,
        mlp_ratio=4,
        depths=[2, 2, 18, 2],
        num_heads=[3, 6, 12, 24],
        strides=(4, 2, 2, 2),
        out_indices=(0, 1, 2, 3),
        qkv_bias=True,
        qk_scale=None,
        patch_norm=True,
        drop_rate=0.0,
        attn_drop_rate=0.0,
        drop_path_rate=0.3,
        use_abs_pos_embed=False,
        act_cfg=dict(type='GELU'),
        norm_cfg=dict(type='LN', requires_grad=True),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=
            'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/swin/swin_small_patch4_window7_224_20220317-7ba6d6dd.pth'
        )),
    decode_head=dict(
        type='UPerHead',
        in_channels=[96, 192, 384, 768],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=512,
        dropout_ratio=0.1,
        num_classes=2,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=384,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=2,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
dataset_type = 'DefinedDataset'
data_root = 'O:/lynn/DLTraing/uav_building_ss_aichallenge'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=True),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
train_pipline_2 = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(
        type='Resize',
        img_scale=[(480, 1024), (512, 1024), (544, 1024), (576, 1024),
                   (608, 1024), (640, 1024), (672, 1024), (704, 1024),
                   (736, 1024), (768, 1024), (800, 1024)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(type='RandomCrop', crop_size=(768, 768), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.75),
    dict(
        type='Albu',
        transforms=[
            dict(
                type='OneOf',
                p=0.5,
                transforms=[
                    dict(
                        type='RandomBrightnessContrast',
                        brightness_limit=0.3,
                        contrast_limit=0.3,
                        p=0.1),
                    dict(type='RandomGamma', p=1),
                    dict(type='ChannelShuffle', p=0.2),
                    dict(type='HueSaturationValue', p=1),
                    dict(type='RGBShift', p=1)
                ]),
            dict(
                type='OneOf',
                p=0.2,
                transforms=[
                    dict(type='GaussNoise', p=1),
                    dict(type='MultiplicativeNoise', p=1),
                    dict(type='IAASharpen', p=1)
                ])
        ],
        keymap=dict(img='image', gt_seg_map='mask')),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(768, 768), pad_val=0, seg_pad_val=0),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=0,
    train=dict(
        type='DefinedDataset',
        data_root='O:/lynn/DLTraing/uav_building_ss_all',
        img_dir='images',
        ann_dir='labels',
        split='splits/train.txt',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(
                type='Resize',
                img_scale=[(480, 1024), (512, 1024), (544, 1024), (576, 1024),
                           (608, 1024), (640, 1024), (672, 1024), (704, 1024),
                           (736, 1024), (768, 1024), (800, 1024)],
                multiscale_mode='value',
                keep_ratio=True),
            dict(type='RandomCrop', crop_size=(768, 768), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.75),
            dict(
                type='Albu',
                transforms=[
                    dict(
                        type='OneOf',
                        p=0.5,
                        transforms=[
                            dict(
                                type='RandomBrightnessContrast',
                                brightness_limit=0.3,
                                contrast_limit=0.3,
                                p=0.1),
                            dict(type='RandomGamma', p=1),
                            dict(type='ChannelShuffle', p=0.2),
                            dict(type='HueSaturationValue', p=1),
                            dict(type='RGBShift', p=1)
                        ]),
                    dict(
                        type='OneOf',
                        p=0.2,
                        transforms=[
                            dict(type='GaussNoise', p=1),
                            dict(type='MultiplicativeNoise', p=1),
                            dict(type='IAASharpen', p=1)
                        ])
                ],
                keymap=dict(img='image', gt_seg_map='mask')),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(768, 768), pad_val=0, seg_pad_val=0),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='DefinedDataset',
        data_root='O:/lynn/DLTraing/uav_building_ss_all',
        img_dir='images',
        ann_dir='labels',
        split='splits/val.txt',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(768, 768),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='DefinedDataset',
        data_root='O:/lynn/DLTraing/uav_building_ss_all',
        img_dir='images',
        ann_dir='labels',
        split='splits/test.txt',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(768, 768),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=1,
    hooks=[
        dict(type='TextLoggerHook', by_epoch=True),
        dict(type='TensorboardLoggerHook', by_epoch=True)
    ])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'E:/mmsegmentation/work_dirs/building_ai_challenge/config/upernet_swin_small_patch4_window7_512x512.pth'
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(
    type='AdamW',
    lr=6e-05,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys=dict(
            absolute_pos_embed=dict(decay_mult=0.0),
            relative_position_bias_table=dict(decay_mult=0.0),
            norm=dict(decay_mult=0.0))))
optimizer_config = dict()
lr_config = dict(
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-06,
    power=1.0,
    min_lr=0.0,
    by_epoch=True)
runner = dict(type='EpochBasedRunner', max_epochs=50)
checkpoint_config = dict(by_epoch=True, interval=1)
evaluation = dict(
    interval=1, metric=['mIoU', 'mDice'], pre_eval=True, save_best='auto')
checkpoint_file = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/swin/swin_small_patch4_window7_224_20220317-7ba6d6dd.pth'
work_dir = 'E:/mmsegmentation/work_dirs/building_ai_challenge/upernet-swin_size768_ss_all'
gpu_ids = [0]
auto_resume = False

(3) The deploy_cfg is segmentation_onnxruntime_static-512x512.py, as follows:
_base_ = ['./segmentation_static.py', '../_base_/backends/onnxruntime.py']
onnx_config = dict(input_shape=[768, 768])
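One way to narrow the problem down might be to check whether the exported ONNX model is itself repeatable under ONNX Runtime's Python API, independent of the SDK's pre- and post-processing. The sketch below makes several assumptions: the file name `end2end.onnx`, the input layout 1x3x768x768 (taken from `input_shape=[768, 768]` above), and a random dummy input instead of a real preprocessed image. Both helper names are hypothetical.

```python
import numpy as np


def outputs_are_repeatable(run_once, runs=5):
    """Call run_once() several times; True if every output equals the first."""
    reference = np.asarray(run_once())
    for _ in range(runs - 1):
        if not np.array_equal(reference, np.asarray(run_once())):
            return False
    return True


def make_onnxruntime_runner(model_path, input_shape=(1, 3, 768, 768), seed=0):
    """Build a zero-argument callable that runs the model on one fixed input."""
    # Imported lazily so the helper above works without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    # Assumption: a fixed random tensor stands in for a preprocessed image.
    rng = np.random.default_rng(seed)
    dummy = rng.standard_normal(input_shape).astype(np.float32)
    return lambda: session.run(None, {input_name: dummy})[0]


# Example (not executed here; "end2end.onnx" is an assumed file name):
# print(outputs_are_repeatable(make_onnxruntime_runner("end2end.onnx")))
```

If this check reports repeatable outputs while the SDK example still varies, that would point towards the SDK pipeline rather than the exported model; if it does not, the model or the runtime would be the more likely suspect.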

Error traceback

No error is reported.

Apr 19 '24 02:04 GSusan
