I tried SOLOv2 to train on barrier data, but I get bad results!
My data has 4 classes (not including background). I used the config solov2_r50_fpn_8gpu_3x.py and didn't change the default parameters, but I get bad results!

The AP I measured is about 52 and the AR is about 47. Could someone help me figure out why I get these bad results?
@zzzz737 What is your definition of barriers?

Like this, all 4 classes.
Did you train from scratch? If you can provide the config file, I am happy to help check it.
# model settings
model = dict(
    type='SOLOv2',
    # pretrained='torchvision://resnet18',
    pretrained=None,
    backbone=dict(
        type='ResNet',
        depth=18,
        num_stages=4,
        out_indices=(0, 1, 2, 3),  # C2, C3, C4, C5
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[64, 128, 256, 512],
        out_channels=256,
        start_level=0,
        num_outs=5),
    bbox_head=dict(
        type='SOLOv2Head',
        num_classes=81,
        in_channels=256,
        stacked_convs=2,
        seg_feat_channels=256,
        strides=[8, 8, 16, 32, 32],
        scale_ranges=((1, 56), (28, 112), (56, 224), (112, 448), (224, 896)),
        sigma=0.2,
        num_grids=[40, 36, 24, 16, 12],
        ins_out_channels=128,
        loss_ins=dict(
            type='DiceLoss',
            use_sigmoid=True,
            loss_weight=3.0),
        loss_cate=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0)),
    mask_feat_head=dict(
        type='MaskFeatHead',
        in_channels=256,
        out_channels=128,
        start_level=0,
        end_level=3,
        num_classes=128,
        norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
)
# training and testing settings
train_cfg = dict()
test_cfg = dict(
    nms_pre=500,
    score_thr=0.1,
    mask_thr=0.5,
    update_thr=0.05,
    kernel='gaussian',  # gaussian/linear
    sigma=2.0,
    max_per_img=100)
# dataset settings
dataset_type = 'CocoDataset'
# data_root = 'data/coco/'
data_root = '/root/data/barrier_instance_samples_coco_format/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='Resize',
         img_scale=[(768, 512), (768, 480), (768, 448),
                    (768, 416), (768, 384), (768, 352)],
         multiscale_mode='value',
         keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(768, 448),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    imgs_per_gpu=16,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline))
# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.01,
    step=[27, 33])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 300
device_ids = range(8)
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/solov2_light_release_r18_fpn_8gpu_3x'
load_from = None
resume_from = None
workflow = [('train', 1)]
This is my config file, thanks for helping me!
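A few values in the posted config may be worth double-checking: `num_classes=81` for a 4-class dataset, `pretrained=None`, and `total_epochs = 300` with LR steps at `[27, 33]`. The sketch below is only an illustration of those checks. `check_config` is a hypothetical helper, not part of mmdetection or SOLO, and the "foreground classes + 1" rule is an assumption based on the COCO configs using 81 for 80 categories.

```python
def check_config(cfg, num_foreground_classes):
    """Return warnings for settings that may hurt accuracy.

    `cfg` is a plain dict holding a few values copied from the config.
    Hypothetical helper for illustration only.
    """
    warnings = []

    # Assumption: num_classes also counts the background class
    # (the COCO configs use 81 for 80 categories).
    expected = num_foreground_classes + 1
    if cfg['num_classes'] != expected:
        warnings.append(
            'num_classes=%d, expected %d' % (cfg['num_classes'], expected))

    # Without pretrained weights the backbone starts from random init.
    if cfg['pretrained'] is None:
        warnings.append('pretrained=None: training from scratch')

    # Flag schedules where most epochs run after the final LR decay.
    last_step = max(cfg['lr_steps'])
    if cfg['total_epochs'] - last_step > last_step:
        warnings.append(
            'total_epochs=%d but last LR step is at epoch %d'
            % (cfg['total_epochs'], last_step))

    return warnings


# Values copied from the config posted above.
posted = dict(num_classes=81, pretrained=None,
              lr_steps=[27, 33], total_epochs=300)
for message in check_config(posted, num_foreground_classes=4):
    print(message)
```

None of these is necessarily the cause of the low AP; they are just the settings that differ most from what a 4-class fine-tuning run would usually use.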
@zzzz737 Can you try a larger input size?
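As a rough illustration of what a larger input size could look like, the `Resize` scales in the train and test pipelines can be raised. The numbers below are an assumption (the common 1333x800 COCO-style scale with a few shorter-side variants), not a tested setting for this dataset, and the names `train_resize` and `test_img_scale` are hypothetical placeholders for the corresponding config entries.

```python
# Hypothetical replacement for the Resize step in train_pipeline.
# Scales are illustrative; tune them to the dataset's image sizes.
train_resize = dict(
    type='Resize',
    img_scale=[(1333, 800), (1333, 768), (1333, 736),
               (1333, 704), (1333, 672), (1333, 640)],
    multiscale_mode='value',
    keep_ratio=True)

# Matching single test scale for MultiScaleFlipAug in test_pipeline.
test_img_scale = (1333, 800)
```

Larger inputs use more GPU memory, so `imgs_per_gpu=16` would probably need to be lowered as well.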
Hi! What dataset are you using? Is it private or public? Can you share it?
Thank you!