pytorch-ssd
evaluation of ssd_mobilenetv1_coco
I'm trying to import ssd_mobilenetv1_coco_2018 by converting it from TensorFlow (.pb) to PyTorch (.pth). After the conversion, I wanted to evaluate it on webcam input, but I noticed a mismatch between some layer settings in the SSD class and the pretrained model, specifically in the last EXTRA conv layers / classification_headers / regression_headers.
I had to edit your code in create_mobilenetv1_ssd like this:
```python
extras = ModuleList([
    Sequential(
        Conv2d(in_channels=1024, out_channels=256, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=512, out_channels=128, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=256, out_channels=128, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=256, out_channels=64, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=2, padding=1),
        ReLU()
    )
])

regression_headers = ModuleList([
    Conv2d(in_channels=512, out_channels=3 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=1024, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=512, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=128, out_channels=6 * 4, kernel_size=1, padding=1),
])

classification_headers = ModuleList([
    Conv2d(in_channels=512, out_channels=3 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=1024, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=512, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=128, out_channels=6 * num_classes, kernel_size=1, padding=1),
])
```
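A rough way to locate mismatches like this is to diff the tensor shapes in the converted checkpoint against the shapes the model's state_dict expects. The sketch below uses two small stand-in modules instead of the real checkpoint and SSD model, so the channel numbers here are illustrative only:

```python
import torch
from torch import nn

def shape_mismatches(reference_sd, model_sd):
    """Return {key: (reference_shape, model_shape)} for keys whose shapes differ,
    with None as the model shape when a key is missing from the model."""
    issues = {}
    for key, ref in reference_sd.items():
        cur = model_sd.get(key)
        if cur is None:
            issues[key] = (tuple(ref.shape), None)
        elif cur.shape != ref.shape:
            issues[key] = (tuple(ref.shape), tuple(cur.shape))
    return issues

# Stand-ins: a "checkpoint" built with 512->256 channels and a model built
# with 512->128 channels, mimicking a mismatched extra layer.
checkpoint_model = nn.Sequential(nn.Conv2d(512, 256, 1), nn.Conv2d(256, 512, 3))
current_model = nn.Sequential(nn.Conv2d(512, 128, 1), nn.Conv2d(128, 512, 3))

diff = shape_mismatches(checkpoint_model.state_dict(), current_model.state_dict())
for key, (ref, cur) in diff.items():
    print(f"{key}: checkpoint {ref} vs model {cur}")
```

With a real checkpoint you would compare `torch.load(path)` (or its `"state_dict"` entry, depending on how it was saved) against `net.state_dict()` the same way.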
This caused the execution of run_converted_pytorch_ssd_live_demo.py to crash with this error:

```
RuntimeError: The size of tensor a (2781) must match the size of tensor b (3000) at non-singleton dimension 1
```
Is it possible that the SSD MobileNet architecture has been modified over time, so that some adjustments are needed to keep the code correct? Or is it just me missing something?
Thanks
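For context on where the 3000 in that error comes from: if I read the default mobilenetv1_ssd_config correctly, it uses six feature maps of sizes 19, 10, 5, 3, 2, 1 with aspect ratios [2, 3] at every scale, giving 2 + 2 * len(ratios) = 6 priors per location. A quick sketch of the arithmetic (my own, not code from the repo):

```python
# Prior count for the default mobilenetv1_ssd_config (assumed layout:
# six feature maps, aspect ratios [2, 3] everywhere).
feature_map_sizes = [19, 10, 5, 3, 2, 1]
priors_per_location = 2 + 2 * len([2, 3])  # small box, big box, 2 per ratio

per_map = [priors_per_location * s * s for s in feature_map_sizes]
total = sum(per_map)

print(per_map)  # [2166, 600, 150, 54, 24, 6]
print(total)    # 3000 -- the "tensor b (3000)" side of the error
```

So 3000 is the number of generated anchors, while 2781 is what the modified heads actually produce; the two disagree because the head outputs no longer line up with the anchor generation.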
Hi @kamauz, the priors/anchors needed by your model and the way the paths branch out to the detection heads might be different.
@kamauz Hi, did you find a solution? I also changed the network structure and ran into the same problem as you.
@qfgaohao Hi, in this situation, should I change the SSDSpec parameters in the config file? Thank you!
@qfgaohao I abandoned this repository a while ago because of this problem. I remember trying to change SSDSpec, but I couldn't make it work. I don't exclude that the solution was there, though. It can be tricky.
@kamauz @notabigfish You can also change the number of channels of the extra layers to make the network output consistent with the generated anchors.
Thank you for the answers!! Actually, similar to @kamauz, I deleted all the BatchNorm layers after the pointwise conv layers and got locations with size [..., 1434] but priors with size [..., 3000]. The reason is that the first feature map size in vision/ssd/ssd.py line 58 is [1, 576, 10, 10], but it should be [1, 576, 19, 19]. So I changed line 14 in vision/ssd/config/mobilenetv1_ssd_config.py to

```python
SSDSpec(10, 16, SSDBoxSizes(60, 105), [2, 3])
```
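The reported sizes are consistent with that change. A sketch of the arithmetic, assuming 6 priors per location on every feature map: shrinking the first map from 19x19 to 10x10 changes the total prior count, which must match the concatenated head outputs.

```python
def total_priors(fmap_sizes, per_location=6):
    """Total anchors: per_location priors at each cell of each feature map."""
    return sum(per_location * s * s for s in fmap_sizes)

print(total_priors([19, 10, 5, 3, 2, 1]))  # 3000 (default config)
print(total_priors([10, 10, 5, 3, 2, 1]))  # 1434 (first map 10x10, as reported)
```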
Then the error that @kamauz mentioned is fixed. However, a new error shows up:

```
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 41 and 47 in dimension 0 at /pytorch/aten/src/TH/generic/THTensor.cpp:711
```
It turns out that after changing the configuration, some labels got size torch.Size([0]). I did not change any convolution layer, only deleted the BatchNorm layers, so in theory the output channels are unchanged, right? Or maybe I'm missing something? Thank you!! @qfgaohao
@notabigfish Are you trying the same setup I tried, i.e. with the .pth file obtained by converting the official TensorFlow model? By the way, months ago I assumed that maybe the Google developers experimented with a version different from the official SSD paper. I don't know whether removing the BatchNorm layers is a good idea. Maybe it's a matter of implementation choices that we can't know unless we can see how they actually trained the network. Keep me updated if you find a solution.