Setup instructions do not work with the training instructions
I set up an Anaconda environment following the SOLO installation instructions, using the settings from the Dockerfile (PyTorch 1.3, CUDA 10.1).
Following the torchvision setup notes, I installed PyTorch 1.3.1 together with torchvision 0.4.2.
Then I mapped MS COCO (2017) into data/coco.
Following the training instructions, I start training with
python tools/train.py configs/solov2/solov2_r101_3x.py
which results in the following error:
ERROR1: ModuleNotFoundError: No module named 'mmcv.cnn.weight_init'
(solo-pt131-cu101) dig_ccm@digs113:~/projects/solov2$ python tools/train.py configs/solov2/solov2_r101_3x.py
Traceback (most recent call last):
  File "tools/train.py", line 13, in <module>
    from mmdet.apis import set_random_seed, train_detector
  File "/home/dig_ccm/projects/solov2/mmdet/apis/__init__.py", line 1, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/home/dig_ccm/projects/solov2/mmdet/apis/inference.py", line 13, in <module>
    from mmdet.models import build_detector
  File "/home/dig_ccm/projects/solov2/mmdet/models/__init__.py", line 3, in <module>
    from .bbox_heads import *  # noqa: F401,F403
  File "/home/dig_ccm/projects/solov2/mmdet/models/bbox_heads/__init__.py", line 3, in <module>
    from .double_bbox_head import DoubleConvFCBBoxHead
  File "/home/dig_ccm/projects/solov2/mmdet/models/bbox_heads/double_bbox_head.py", line 2, in <module>
    from mmcv.cnn.weight_init import normal_init, xavier_init
ModuleNotFoundError: No module named 'mmcv.cnn.weight_init'
As a quick fix, I changed the import path from mmcv.cnn.weight_init to mmcv.cnn in double_bbox_head.py and hrfpn.py.
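A less invasive variant of this quick fix is to try the old import path first and fall back to the new one, so the same file loads with either mmcv layout. The sketch below is a generic helper, not code from SOLO or mmcv: `import_first_available` is a hypothetical name, and the idea that `normal_init` and `xavier_init` are available from `mmcv.cnn` rests only on the quick fix described above.

```python
import importlib


def import_first_available(module_paths, attr_name):
    """Return attr_name from the first importable module that defines it.

    Hypothetical helper for illustration; it is not part of mmcv or SOLO.
    """
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            errors.append(f"{path}: {exc}")
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
        errors.append(f"{path}: no attribute {attr_name!r}")
    raise ImportError(
        f"could not import {attr_name!r}; tried " + "; ".join(errors)
    )


# Assumed usage inside double_bbox_head.py (module paths taken from the
# traceback and the quick fix, not verified against any mmcv release):
# normal_init = import_first_available(
#     ["mmcv.cnn.weight_init", "mmcv.cnn"], "normal_init")
```

This only papers over the moved module. If other mmcv APIs changed as well, installing the mmcv version the repository expects is the more reliable route.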
Starting training again with
python tools/train.py configs/solov2/solov2_r101_3x.py
now produces a warning: unexpected key in source state_dict: fc.weight, fc.bias
and then fails with TypeError: impad() takes 1 positional argument but 2 positional arguments (and 1 keyword-only argument) were given
followed by: Segmentation fault
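The wording of this TypeError is what Python reports when a function accepts only one positional argument and the caller passes a second one positionally. The sketch below reproduces that situation with a hypothetical stand-in named `impad`; it is not mmcv's implementation, and its signature is an assumption made for illustration. It would be consistent with the installed mmcv declaring `shape` as keyword-only while the repository code still passes it positionally, but that reading is inferred from the message alone.

```python
def impad(img, *, shape=None, pad_val=0):
    """Hypothetical stand-in, NOT mmcv's code: shape is keyword-only here."""
    return {"img": img, "shape": shape, "pad_val": pad_val}


def call_old_style(img, shape):
    """Call impad with shape passed positionally and return the error text."""
    try:
        impad(img, shape, pad_val=0)
    except TypeError as exc:
        return str(exc)
    return None


# The old-style call fails because shape is given as a second positional argument.
message = call_old_style("image", (4, 4))
print(message)

# Passing shape by keyword matches the keyword-only signature and succeeds.
padded = impad("image", shape=(4, 4), pad_val=0)
```

If this is the cause, either the call sites need to pass `shape=` by keyword or the environment needs an mmcv release whose signature matches the repository code.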
Can someone please give me a hint as to what is wrong here?
@JRGit4UE It seems that the mmcv version is not correct.
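To check which versions are actually installed in the active environment, a small script along these lines may help. It is a minimal sketch: the distribution names in the loop are assumptions about what might be installed, and the version of mmcv that this repository expects is not stated in this thread, so compare the output against the repository's own requirements.

```python
try:
    from importlib import metadata  # available in Python 3.8 and later
except ImportError:
    metadata = None  # older Python: fall back to pkg_resources below


def installed_version(dist_name):
    """Return the installed version string of dist_name, or None if absent."""
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # shipped with setuptools

    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


# Distribution names below are assumptions; adjust them to your environment.
for name in ("mmcv", "mmcv-full", "torch", "torchvision"):
    print(name, installed_version(name) or "not installed")
```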