mmrotate
mmrotate copied to clipboard
使用 rotate fcos 输出模型参数的时候报错 The value is the same before and after calling `init_weights` of RotatedFCOS
Thanks for your error report and we appreciate it a lot.
Checklist
- I have searched related issues but cannot get the expected help.
- I have read the FAQ documentation but cannot get the expected help.
- The bug has not been fixed in the latest version.
Describe the bug A clear and concise description of what the bug is. 在使用rotate fcous模型的时候在输出模型参数的过程中出现
bbox_head.scales.0.scale - torch.Size([]):
The value is the same before and after calling init_weights
of RotatedFCOS
bbox_head.scales.1.scale - torch.Size([]):
The value is the same before and after calling init_weights
of RotatedFCOS
bbox_head.scales.2.scale - torch.Size([]):
The value is the same before and after calling init_weights
of RotatedFCOS
bbox_head.scales.3.scale - torch.Size([]):
The value is the same before and after calling init_weights
of RotatedFCOS
bbox_head.scales.4.scale - torch.Size([]):
The value is the same before and after calling init_weights
of RotatedFCOS
bbox_head.scale_angle.scale - torch.Size([]):
The value is the same before and after calling init_weights
of RotatedFCOS
无论是否加载预训练模型都会有该报错
Reproduction
- What command or script did you run? python tools/train.py
A placeholder for the command.
- Did you make any modifications on the code or config? Did you understand what you have modified? rotated_fcos_r50_fpn_1x_dota_le90.py
base = [ '../base/datasets/dotav1.py', '../base/schedules/schedule_1x.py', '../base/default_runtime.py' ] angle_version = 'le90'
model settings
model = dict( type='RotatedFCOS', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=1, zero_init_residual=False, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), #style='pytorch', #init_cfg=dict(type='Pretrained', checkpoint='./pre_model/resnet50-19c8e357.pth')), neck=dict( type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256, start_level=1, add_extra_convs='on_output', # use P5 num_outs=5, relu_before_extra_convs=True), bbox_head=dict( type='RotatedFCOSHead', num_classes=15, in_channels=256, stacked_convs=4, feat_channels=256, strides=[8, 16, 32, 64, 128], center_sampling=True, center_sample_radius=1.5, norm_on_bbox=True, centerness_on_reg=True, separate_angle=False, scale_angle=True, bbox_coder=dict( type='DistanceAnglePointCoder', angle_version=angle_version), loss_cls=dict( type='FocalLoss', use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=1.0), loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0), loss_centerness=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)), # training and testing settings train_cfg=None, test_cfg=dict( nms_pre=2000, min_bbox_size=0, score_thr=0.05, nms=dict(iou_thr=0.1), max_per_img=2000))
img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='RResize', img_scale=(1024, 1024)), dict( type='RRandomFlip', flip_ratio=[0.25, 0.25, 0.25], direction=['horizontal', 'vertical', 'diagonal'], version=angle_version), dict(type='Normalize', **img_norm_cfg), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ] data = dict( train=dict(pipeline=train_pipeline, version=angle_version), val=dict(version=angle_version), test=dict(version=angle_version)) 预训练模型哪里无论修改与否都会报错
- What dataset did you use?
Environment
mmcv-full 1.3.18 mmdet 2.25.1 mmocr 0.6.1 mmrotate 0.3.2 torch 1.6.0+cu101 torchvision 0.7.0+cu101
- Please run
python mmrotate/utils/collect_env.py
to collect necessary environment information and paste it here. sys.platform: linux Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0] CUDA available: False GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 PyTorch: 1.7.0 PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.8.1 OpenCV: 4.6.0 MMCV: 1.6.0 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMRotate: 0.3.2+c62f148
- You may add addition that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as
$PATH
,$LD_LIBRARY_PATH
,$PYTHONPATH
, etc.)
Error traceback If applicable, paste the error trackback here.
A placeholder for trackback.
Bug fix If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
We recommend using English or English & Chinese for issues so that we could have broader discussion.
在跑image_demo.py的时候也会卡住或者报错,且没有显示报错信息,只显示了 load checkpoint from local path:
/home/kas/ori_code/watermark/check/mmrotate/mmrotate/utils/setup_env.py:39: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. f'Setting OMP_NUM_THREADS environment variable for each process ' /home/kas/ori_code/watermark/check/mmrotate/mmrotate/utils/setup_env.py:49: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. f'Setting MKL_NUM_THREADS environment variable for each process ' 2022-10-14 10:25:06,328 - mmrotate - INFO - Environment info:
sys.platform: linux Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0] CUDA available: True GPU 0: NVIDIA GeForce RTX 2080 Ti CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 10.1, V10.1.24 GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 PyTorch: 1.7.1+cu101 PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
- CuDNN 7.6.3
- Magma 2.5.2
- Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, TorchVision: 0.8.2+cu101 OpenCV: 4.1.1 MMCV: 1.6.0 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMRotate: 0.3.2+c62f148
2022-10-14 10:25:07,164 - mmrotate - INFO - Distributed training: False 2022-10-14 10:25:08,045 - mmrotate - INFO - Config: dataset_type = 'DOTADataset' data_root = '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='RResize', img_scale=(1024, 1024)), dict( type='RRandomFlip', flip_ratio=[0.25, 0.25, 0.25], direction=['horizontal', 'vertical', 'diagonal'], version='le90'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1024, 1024), flip=False, transforms=[ dict(type='RResize'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img']) ]) ] data = dict( samples_per_gpu=2, workers_per_gpu=2, train=dict( type='DOTADataset', ann_file= '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/trainval/annfiles/', img_prefix= '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/trainval/images/', pipeline=[ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='RResize', img_scale=(1024, 1024)), dict( type='RRandomFlip', flip_ratio=[0.25, 0.25, 0.25], direction=['horizontal', 'vertical', 'diagonal'], version='le90'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ], version='le90'), val=dict( type='DOTADataset', ann_file= '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/trainval/annfiles/', img_prefix= '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/trainval/images/', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1024, 1024), flip=False, transforms=[ dict(type='RResize'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img']) ]) ], version='le90'), test=dict( type='DOTADataset', ann_file= '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/test/images/', img_prefix= '/home/kas/ori_code/watermark/east_data_924/1011_add_doc_data_mmrotate/split_data/test/images/', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1024, 1024), flip=False, transforms=[ dict(type='RResize'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img']) ]) ], version='le90')) evaluation = dict(interval=1, metric='mAP') optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)) lr_config = dict( policy='step', warmup='linear', warmup_iters=500, warmup_ratio=0.3333333333333333, step=[8, 11]) runner = dict(type='EpochBasedRunner', max_epochs=12) checkpoint_config = dict(interval=1) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) dist_params = dict(backend='nccl') log_level = 'INFO' load_from = None resume_from = None workflow = [('train', 1)] opencv_num_threads = 0 mp_start_method = 'fork' angle_version = 'le90' model = dict( type='RotatedFCOS', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=1, zero_init_residual=False, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), neck=dict( type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256, start_level=1, add_extra_convs='on_output', num_outs=5, relu_before_extra_convs=True), bbox_head=dict( type='RotatedFCOSHead', num_classes=1, in_channels=256, stacked_convs=4, feat_channels=256, strides=[8, 16, 32, 64, 128], center_sampling=True, center_sample_radius=1.5, norm_on_bbox=True, centerness_on_reg=True, separate_angle=False, scale_angle=True, bbox_coder=dict(type='DistanceAnglePointCoder', angle_version='le90'), loss_cls=dict( type='FocalLoss', use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=1.0), loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0), loss_centerness=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)), train_cfg=None, test_cfg=dict( nms_pre=2000, min_bbox_size=0, score_thr=0.05, nms=dict(iou_thr=0.1), max_per_img=2000)) work_dir = 'run' auto_resume = False gpu_ids = range(0, 1) 2022-10-14 10:25:08,045 - mmrotate - INFO - Set random seed to 299207498, deterministic: False 2022-10-14 10:25:16,776 - mmrotate - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2022-10-14 10:25:49,269 - mmrotate - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'} 2022-10-14 10:25:51,575 - mmrotate - INFO - initialize RotatedFCOSHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01, 'override': {'type': 'Normal', 'name': 'conv_cls', 'std': 0.01, 'bias_prob': 0.01}} 没有输出报错信息
As far as we know, The value is the same before and after calling init_weights of RotatedFCOS
will not affect the running of the model. The model may be stopped for other reasons. Can you run other models normally?
@liuyanyi Please have a look.
If the log stuck at the initialize with init_cfg
, maybe the process stuck at loading annotations? You can wait a little bit longer to see if it works.
But for image_demo, it doesn't need to read annotations, you can manually terminate the code when it gets stuck to see the stuck line.
我自己的问题,没有给足够的显存,是卡在了torch读取模型哪里了
训练的时候The value is the same before and after calling init_weights
这个问题解决了吗,我也遇到这个问题 @locusbear
训练的时候The value is the same before and after calling
init_weights
这个问题解决了吗,我也遇到这个问题 @locusbear
解决了的,是我没有给足够的显存,导致程序在读取加载模型那里卡住了,也不会报错
请问这会影响最终的结果吗,我也显示这个,最终结果很低