
The inference speed of ICNet gradually increases.

Howietzh opened this issue 3 years ago • 4 comments

When running the script below, I found that the inference speed of ICNet increased gradually.

CUDA_VISIBLE_DEVICES=0 python -u tools/benchmark.py configs/coarse_position/icnet_r18-d8_512x612_20k_coarseposition.py work_dirs/icnet_r50-d8_512x612_20k_coarseposition/latest.pth --repeat-times 5

I have tried increasing num_warmup to 1000 and total_iters to 1200, but the problem remains.
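For reference, creeping FPS readings in a benchmark usually come from a timing loop that either warms up too little or never calls torch.cuda.synchronize(), so the earliest iterations absorb CUDA launch and allocator overhead. Below is a minimal, self-contained timing sketch (with a stand-in model on a CUDA device; this is not tools/benchmark.py itself, just the pattern a correct loop follows):

# Hypothetical standalone timing sketch; assumes a CUDA-capable machine.
# Illustrates the two usual causes of "gradually increasing" FPS readings:
# too few warm-up iterations and missing torch.cuda.synchronize() calls.
import time
import torch
import torch.nn as nn

model = nn.Sequential(  # stand-in for the real segmentor
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
).cuda().eval()
x = torch.randn(1, 3, 512, 612, device='cuda')

num_warmup, total_iters = 50, 200
with torch.no_grad():
    for i in range(total_iters):
        torch.cuda.synchronize()          # wait for any pending GPU work
        start = time.perf_counter()
        model(x)
        torch.cuda.synchronize()          # ensure the kernels actually finished
        elapsed = time.perf_counter() - start
        if i >= num_warmup and (i - num_warmup) % 50 == 0:
            print(f'iter {i}: {1.0 / elapsed:.1f} FPS')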

My config file is as follows:

norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    backbone=dict(
        type='ICNet',
        backbone_cfg=dict(
            type='ResNetV1c',
            in_channels=3,
            depth=18,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            dilations=(1, 1, 2, 4),
            strides=(1, 2, 1, 1),
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            norm_eval=False,
            style='pytorch',
            contract_dilation=True),
        in_channels=3,
        layer_channels=(128, 512),
        light_branch_middle_channels=32,
        psp_out_channels=512,
        out_channels=(64, 256, 256),
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False),
    neck=dict(
        type='ICNeck',
        in_channels=(64, 256, 256),
        out_channels=128,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False),
    decode_head=dict(
        type='FCNHead',
        in_channels=128,
        channels=128,
        num_convs=1,
        in_index=2,
        dropout_ratio=0,
        num_classes=4,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        concat_input=False,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=128,
            num_convs=1,
            num_classes=4,
            in_index=0,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=128,
            num_convs=1,
            num_classes=4,
            in_index=1,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4))
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
dataset_type = 'CustomDataset'
data_root = '../DataSets/CoarsePosition'
classes = ('back_ground', 'headface', 'fpc', 'connector')
palette = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]]
img_norm_cfg = dict(
    mean=[73.98013, 72.46433, 71.06376],
    std=[33.015854, 28.528011, 26.457438],
    to_rgb=True)
crop_size = (612, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(612, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(612, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[73.98013, 72.46433, 71.06376],
        std=[33.015854, 28.528011, 26.457438],
        to_rgb=True),
    dict(type='Pad', size=(612, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(612, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[73.98013, 72.46433, 71.06376],
                std=[33.015854, 28.528011, 26.457438],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CustomDataset',
        data_root='../DataSets/CoarsePosition',
        classes=('back_ground', 'headface', 'fpc', 'connector'),
        palette=[[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]],
        img_suffix='.png',
        img_dir='images/train',
        ann_dir='annotations/train',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(type='Resize', img_scale=(612, 512), ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(612, 512), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[73.98013, 72.46433, 71.06376],
                std=[33.015854, 28.528011, 26.457438],
                to_rgb=True),
            dict(type='Pad', size=(612, 512), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='CustomDataset',
        data_root='../DataSets/CoarsePosition',
        classes=('back_ground', 'headface', 'fpc', 'connector'),
        palette=[[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]],
        img_suffix='.png',
        img_dir='images/val',
        ann_dir='annotations/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(612, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[73.98013, 72.46433, 71.06376],
                        std=[33.015854, 28.528011, 26.457438],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CustomDataset',
        data_root='../DataSets/CoarsePosition',
        classes=('back_ground', 'headface', 'fpc', 'connector'),
        palette=[[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]],
        img_suffix='.png',
        img_dir='images/val',
        ann_dir='annotations/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(612, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[73.98013, 72.46433, 71.06376],
                        std=[33.015854, 28.528011, 26.457438],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=0.0001, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=20000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
work_dir = 'work_dirs/icnet_r50-d8_512x612_20k_coarseposition/'
gpu_ids = range(0, 8)
auto_resume = False
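One detail worth noting in the config above: cudnn_benchmark = True lets cuDNN try several convolution algorithms on the first calls for each input shape and cache the fastest one, so the earliest iterations are measurably slower and FPS appears to climb. A hypothetical standalone A/B sketch (stand-in conv layer, CUDA assumed) to isolate that effect:

# Hypothetical A/B sketch: compare first-iteration vs steady-state latency
# with cuDNN autotuning on and off. With benchmark=True, the first call per
# input shape pays the autotuning cost, which mimics "increasing" speed.
import time
import torch
import torch.nn as nn

def time_first_iters(benchmark_mode, iters=20):
    torch.backends.cudnn.benchmark = benchmark_mode
    net = nn.Conv2d(3, 64, 3, padding=1).cuda().eval()
    x = torch.randn(1, 3, 512, 612, device='cuda')
    times = []
    with torch.no_grad():
        for _ in range(iters):
            torch.cuda.synchronize()
            t0 = time.perf_counter()
            net(x)
            torch.cuda.synchronize()
            times.append(time.perf_counter() - t0)
    return times

for mode in (True, False):
    t = time_first_iters(mode)
    print(f'benchmark={mode}: first={t[0] * 1e3:.2f} ms, '
          f'last={t[-1] * 1e3:.2f} ms')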

Howietzh · Sep 28 '22

How many times have you observed this phenomenon? Did you check the CPU usage when you ran the experiments?

MeowZheng · Sep 28 '22

I have run this script many times, and the phenomenon occurs every time. I checked the CPU usage and it goes up to 273%; that can't be good, right? How should I handle this problem? By the way, benchmark.py is the only thing I run on the server, and its CPU usage still reaches 273%.
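As an aside, 273% CPU on a multi-core server just means roughly three cores are busy, which is normal for a PyTorch process with default thread settings; the real question is whether other processes compete for those cores. To rule out thread contention, a hypothetical snippet for pinning thread counts before benchmarking (the environment variable names are the standard OpenMP/MKL ones; they must be set before the threaded libraries initialize, or exported in the shell beforehand):

# Hypothetical sketch: cap CPU threading so background thread contention
# does not skew GPU benchmark timings. Set the env vars before importing
# torch/numpy, or export them in the shell that launches the benchmark.
import os
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'

import torch
torch.set_num_threads(1)  # caps intra-op CPU parallelism in PyTorch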

Howietzh · Sep 29 '22

That seems a little unreasonable.

If you ran this script many times and the speed increased every time, i.e. the inference time dropped every time, would the time eventually drop to 0? That is clearly impossible, so the inference speed must stabilize at some point.

I think the reason might be your device. Please monitor the CPU/GPU usage of other tasks on your device.
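For the monitoring itself, a small hypothetical sketch using psutil (an assumption; install it via pip) that samples system-wide CPU load while the benchmark runs, to spot competing tasks:

# Hypothetical monitor: print overall CPU utilization and load average
# once per second for ten seconds while the benchmark runs elsewhere.
import psutil

for _ in range(10):
    print(f'CPU {psutil.cpu_percent(interval=1.0):.1f}% | '
          f'load avg {psutil.getloadavg()}')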

MeowZheng · Oct 11 '22

Did you find a solution to your problem? Could you share it?

epris · May 17 '24