Setup instruction does not work with training instruction

Open JRGit4UE opened this issue 5 years ago • 1 comment

I set up the environment according to the SOLO installation instructions, using the settings from the Dockerfile (PyTorch 1.3, CUDA 10.1) in an Anaconda environment.

Following the torchvision setup, I installed PyTorch 1.3.1 together with torchvision 0.4.2.

I then mapped MS COCO (2017) into data/coco.

Following the training instructions, I start training with python tools/train.py configs/solov2/solov2_r101_3x.py, which fails with the first error: ModuleNotFoundError: No module named 'mmcv.cnn.weight_init'

(solo-pt131-cu101) dig_ccm@digs113:~/projects/solov2$ python tools/train.py configs/solov2/solov2_r101_3x.py
Traceback (most recent call last):
  File "tools/train.py", line 13, in <module>
    from mmdet.apis import set_random_seed, train_detector
  File "/home/dig_ccm/projects/solov2/mmdet/apis/__init__.py", line 1, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/home/dig_ccm/projects/solov2/mmdet/apis/inference.py", line 13, in <module>
    from mmdet.models import build_detector
  File "/home/dig_ccm/projects/solov2/mmdet/models/__init__.py", line 3, in <module>
    from .bbox_heads import *  # noqa: F401,F403
  File "/home/dig_ccm/projects/solov2/mmdet/models/bbox_heads/__init__.py", line 3, in <module>
    from .double_bbox_head import DoubleConvFCBBoxHead
  File "/home/dig_ccm/projects/solov2/mmdet/models/bbox_heads/double_bbox_head.py", line 2, in <module>
    from mmcv.cnn.weight_init import normal_init, xavier_init
ModuleNotFoundError: No module named 'mmcv.cnn.weight_init'

As a quick fix, I changed the import from mmcv.cnn.weight_init to mmcv.cnn in double_bbox_head.py and hrfpn.py.
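
Instead of editing each file by hand, a fallback import could try the old module path first and then the new one. The sketch below is a minimal, generic version of that idea; `import_first` is a hypothetical helper and is not part of mmcv or mmdet.

```python
import importlib


def import_first(module_names, attr_names):
    """Return the requested attributes from the first importable module.

    Hypothetical helper for illustration; not part of mmcv or mmdet.
    """
    last_error = None
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
            return tuple(getattr(module, name) for name in attr_names)
        except (ImportError, AttributeError) as err:
            # Either the module path does not exist in this version,
            # or it exists but lacks one of the requested attributes.
            last_error = err
    raise ImportError(
        f"none of {list(module_names)} provides {list(attr_names)}"
    ) from last_error
```

In double_bbox_head.py this might be used as normal_init, xavier_init = import_first(['mmcv.cnn.weight_init', 'mmcv.cnn'], ['normal_init', 'xavier_init']), assuming both names are exported from mmcv.cnn in the newer layout, as the quick fix suggests.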

Starting training again with python tools/train.py configs/solov2/solov2_r101_3x.py first produces a warning:

unexpected key in source state_dict: fc.weight, fc.bias

and then fails with:

TypeError: impad() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given
Segmentation fault
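
The wording of this TypeError is what Python reports when a function declares parameters as keyword-only but is called with an extra positional argument. The sketch below reproduces the message with a stand-in function; `impad_like` and its parameter names are assumptions for illustration and are not copied from any mmcv release.

```python
def impad_like(img, *, shape=None, pad_val=0):
    """Stand-in with keyword-only parameters after `img` (assumed signature)."""
    return img, shape, pad_val


def call_old_style(img, shape):
    """Call the stand-in with shape passed positionally; return the error text."""
    try:
        impad_like(img, shape, pad_val=0)
    except TypeError as err:
        return str(err)
    return None


def call_new_style(img, shape):
    """Pass shape by keyword, which a keyword-only signature accepts."""
    return impad_like(img, shape=shape, pad_val=0)


message = call_old_style("image", (4, 4))
print(message)
```

If the installed mmcv does declare impad this way, the call in the repository would need to pass the shape by keyword, or the mmcv release the repository was written against would need to be installed. The thread does not confirm which applies.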

Can someone please give me a hint as to what's wrong here?

Aug 11 '20 12:08 JRGit4UE

@JRGit4UE It seems that the mmcv version is not correct.
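
To see which versions are actually installed in the active environment, the package metadata can be queried directly. This is a minimal sketch assuming Python 3.8 or newer (for importlib.metadata) and assuming the distributions are named mmcv, torch, and torchvision; `installed_version` and `report` are hypothetical helpers.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report(dist_names):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in dist_names}


# Distribution names below are assumptions about how the packages were installed.
for name, found in report(["mmcv", "torch", "torchvision"]).items():
    print(f"{name}: {found or 'not installed'}")
```

The reported mmcv version can then be compared with the one named in the repository's installation instructions.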

Jan 23 '21 14:01 YuqingWang1029