CrossStagePartialNetworks

When I use the csresnext50-panet-spp.cfg to train

Open Timmmmmms opened this issue 4 years ago • 5 comments

RuntimeError: shape '[512, 2048, 1, 1]' is invalid for input of size 680093

Can you tell me what is wrong?

Timmmmmms avatar Dec 16 '19 14:12 Timmmmmms

@Timmmmmms

I guess you are running the code from https://github.com/ultralytics/yolov3 with a pretrained model.

If so, please add

    elif file == 'csresnext50c.conv.80': # change to the file name of your pretrained model
        cutoff = 80

in https://github.com/ultralytics/yolov3/blob/master/models.py#L324
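A minimal sketch of why the cutoff matters, assuming the loader reads one flat buffer of weights and reshapes it layer by layer. The function `load_flat_weights` and the toy shapes are hypothetical and are not the ultralytics code. For simplicity, the sketch counts only convolutional layers, whereas the real cutoff counts cfg layers. A partial file such as a `.conv.80` file holds weights for the first layers only, so reading past the cutoff leaves too few values for the next layer. In the error above, shape [512, 2048, 1, 1] needs 512 × 2048 = 1,048,576 values but only 680,093 remain.

```python
from math import prod


def load_flat_weights(flat, layer_shapes, cutoff=None):
    """Slice a flat weight buffer into per-layer chunks (hypothetical helper).

    flat         : flat list of floats, a stand-in for a partial weights file
    layer_shapes : conv weight shapes in network order, (out, in, kh, kw)
    cutoff       : load only the first `cutoff` layers; None means all layers
    """
    if cutoff is not None:
        layer_shapes = layer_shapes[:cutoff]
    loaded, ptr = [], 0
    for shape in layer_shapes:
        n = prod(shape)
        chunk = flat[ptr:ptr + n]  # shorter than n if the file runs out
        if len(chunk) != n:
            # Same failure mode as the RuntimeError reported in this issue
            raise ValueError(
                f"shape {list(shape)} is invalid for input of size {len(chunk)}"
            )
        loaded.append((shape, chunk))
        ptr += n
    return loaded


# Toy network with three conv layers; the file fully covers only the first two.
shapes = [(4, 3, 1, 1), (8, 4, 3, 3), (16, 8, 1, 1)]
partial_file = [0.0] * (prod(shapes[0]) + prod(shapes[1]) + 5)

layers = load_flat_weights(partial_file, shapes, cutoff=2)  # stops before layer 3
```

Calling `load_flat_weights(partial_file, shapes)` without a cutoff raises the shape error, because the third layer needs 128 values and only 5 are left.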

Also, because https://github.com/ultralytics/yolov3 does not support adding feature maps with different numbers of channels in a shortcut layer, you should modify the number of filters as described in https://github.com/ultralytics/yolov3/issues/698#issuecomment-563906649.
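To illustrate the channel mismatch, here is a rough sketch of a shortcut that adds only over the channels both inputs share. That this is how Darknet handles differing channel counts is my understanding and an assumption here, not something verified against the current source. The function `shortcut_add` is hypothetical. A strict elementwise add, by contrast, requires equal channel counts, which is why the filter number has to be changed.

```python
def shortcut_add(out, skip):
    """Add `skip` into `out` over the channels they share (hypothetical helper).

    Both arguments are lists of channels; each channel is a flat list of
    floats. The spatial size is assumed to match.
    """
    shared = min(len(out), len(skip))
    result = [list(ch) for ch in out]  # copy so the input is not mutated
    for c in range(shared):
        result[c] = [a + b for a, b in zip(out[c], skip[c])]
    return result


out = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # 3 channels
skip = [[10.0, 10.0], [20.0, 20.0]]         # 2 channels
merged = shortcut_add(out, skip)
print(merged)  # [[11.0, 11.0], [22.0, 22.0], [3.0, 3.0]]
```

The third channel passes through unchanged because `skip` has no matching channel.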

~~However, I think the order of multiple-inputs-route-layer in https://github.com/AlexeyAB/darknet and https://github.com/ultralytics/yolov3 may be different.~~ ~~So when you use a pretrained model, you may not get the expected results.~~ https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/3#issuecomment-566314619

WongKinYiu avatar Dec 16 '19 14:12 WongKinYiu

@WongKinYiu

> However, I think the order of multiple-inputs-route-layer in https://github.com/AlexeyAB/darknet and https://github.com/ultralytics/yolov3 may be different.

This conversion code works well with yolov3-spp.weights/cfg file despite the fact that yolov3-spp uses route layers with multiple inputs: https://github.com/ultralytics/yolov3#darknet-conversion

AlexeyAB avatar Dec 16 '19 23:12 AlexeyAB

@AlexeyAB Thanks

After checking the code, the two implementations seem to be the same: https://github.com/ultralytics/yolov3/blob/master/models.py#L56 https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L846

I will check why csresnext50-panet-spp does not perform normally after being converted to .pt or trained in PyTorch.

Update: the two implementations handle the number of filters in the shortcut layer differently, but I am not sure whether this actually affects the result. https://github.com/ultralytics/yolov3/blob/master/models.py#L63

WongKinYiu avatar Dec 17 '19 00:12 WongKinYiu

@WongKinYiu

I tried that, but I still get an error when I train on my data:

    python train.py --data data/coco.data --cfg cfg/csresnext50-panet-spp.cfg

RuntimeError: shape '[512, 512, 3, 3]' is invalid for input of size 1620480

Timmmmmms avatar Dec 17 '19 01:12 Timmmmmms

@Timmmmmms Hello,

Use --weights '' if you do not want to use pretrained weights:

    python train.py --data data/coco.data --weights '' --cfg cfg/csresnext50-panet-spp.cfg
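As a rough illustration of how an empty `--weights` value can switch off pretrained loading, here is a small `argparse` sketch. The default file name `pretrained.weights` and the helper `should_load_pretrained` are hypothetical and do not come from the ultralytics train.py.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", default="data/coco.data")
    parser.add_argument("--cfg", default="cfg/csresnext50-panet-spp.cfg")
    # Hypothetical default; an empty string means "train from scratch".
    parser.add_argument("--weights", default="pretrained.weights")
    return parser.parse_args(argv)


def should_load_pretrained(weights):
    """Return True only when a non-empty weights path was given."""
    return bool(weights)


args = parse_args([
    "--data", "data/coco.data",
    "--weights", "",
    "--cfg", "cfg/csresnext50-panet-spp.cfg",
])
if should_load_pretrained(args.weights):
    print(f"loading {args.weights}")
else:
    print("training from scratch")
```

With the empty string, no weights file is read, so a shape mismatch in a pretrained file cannot occur.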

WongKinYiu avatar Dec 17 '19 04:12 WongKinYiu