
Problem loading the pre-trained model

Open · tim120526 opened this issue on Sep 13, 2018 · 4 comments

Hi XiaoMeng! The problem occurs when I load the pre-trained model you provided at https://drive.google.com/open?id=1EwRuqfGASarGidutnYB8rXLSuzYpEoSM (the file named imagenet_epoch_2_glo_step_128118.pth.tar). The error is: Missing key(s) in state_dict: "Conv2d_1a_3x3.conv.weight", "Conv2d_1a_3x3.bn.weight", "Conv2d_1a_3x3.bn.bias", "Conv2d_1a_3x3.bn.running_mean", "Conv2d_1a_3x3.bn.running_var", "Conv2d_2a_3x3.conv.weight", "Conv2d_2a_3x3.bn.weight", "Conv2d_2a_3x3.bn.bias", "Conv2d_2a_3x3.bn.running_mean", "Conv2d_2a_3x3.bn.running_var", "Conv2d_2b_3x3.conv.weight", "Conv2d_2b_3x3.bn.weight", "Conv2d_2b_3x3.bn.bias", "Conv2d_2b_3x3.bn.running_mean", "Conv2d_2b_3x3.bn.running_var", "Conv2d_3b_1x1.conv.weight", "Conv2d_3b_1x1.bn.weight", "Conv2d_3b_1x1.bn.bias", "Conv2d_3b_1x1.bn.running_mean", "Conv2d_3b_1x1.bn.running_var", "Conv2d_4a_3x3.conv.weight", "Conv2d_4a_3x3.bn.weight", "Conv2d_4a_3x3.bn.bias", "Conv2d_4a_3x3.bn.running_mean", "Conv2d_4a_3x3.bn.running_var", "Mixed_5b.branch1x1.conv.weight", "Mixed_5b.branch1x1.bn.weight", "Mixed_5b.branch1x1.bn.bias", "Mixed_5b.branch1x1.bn.running_mean", "Mixed_5b.branch1x1.bn.running_var", "Mixed_5b.branch5x5_1.conv.weight", "Mixed_5b.branch5x5_1.bn.weight", "Mixed_5b.branch5x5_1.bn.bias", "Mixed_5b.branch5x5_1.bn.running_mean", "Mixed_5b.branch5x5_1.bn.running_var", "Mixed_5b.branch5x5_2.conv.weight", "Mixed_5b.branch5x5_2.bn.weight", "Mixed_5b.branch5x5_2.bn.bias", "Mixed_5b.branch5x5_2.bn.running_mean", "Mixed_5b.branch5x5_2.bn.running_var", "Mixed_5b.branch3x3dbl_1.conv.weight", "Mixed_5b.branch3x3dbl_1.bn.weight", "Mixed_5b.branch3x3dbl_1.bn.bias", "Mixed_5b.branch3x3dbl_1.bn.running_mean", "Mixed_5b.branch3x3dbl_1.bn.running_var", "Mixed_5b.branch3x3dbl_2.conv.weight", "Mixed_5b.branch3x3dbl_2.bn.weight", "Mixed_5b.branch3x3dbl_2.bn.bias", "Mixed_5b.branch3x3dbl_2.bn.running_mean", "Mixed_5b.branch3x3dbl_2.bn.running_var", "Mixed_5b.branch3x3dbl_3.conv.weight", "Mixed_5b.branch3x3dbl_3.bn.weight", "Mixed_5b.branch3x3dbl_3.bn.bias", "Mixed_5b.branch3x3dbl_3.bn.running_mean", "Mixed_5b.branch3x3dbl_3.bn.running_var", "Mixed_5b.branch_pool.conv.weight", "Mixed_5b.branch_pool.bn.weight", "Mixed_5b.branch_pool.bn.bias", "Mixed_5b.branch_pool.bn.running_mean", "Mixed_5b.branch_pool.bn.running_var", "Mixed_5c.branch1x1.conv.weight", "Mixed_5c.branch1x1.bn.weight", "Mixed_5c.branch1x1.bn.bias", "Mixed_5c.branch1x1.bn.running_mean", "Mixed_5c.branch1x1.bn.running_var", "Mixed_5c.branch5x5_1.conv.weight", "Mixed_5c.branch5x5_1.bn.weight", "Mixed_5c.branch5x5_1.bn.bias", "Mixed_5c.branch5x5_1.bn.running_mean", "Mixed_5c.branch5x5_1.bn.running_var", "Mixed_5c.branch5x5_2.conv.weight", "Mixed_5c.branch5x5_2.bn.weight", "Mixed_5c.branch5x5_2.bn.bias", "Mixed_5c.branch5x5_2.bn.running_mean", "Mixed_5c.branch5x5_2.bn.running_var", "Mixed_5c.branch3x3dbl_1.conv.weight", "Mixed_5c.branch3x3dbl_1.bn.weight", "Mixed_5c.branch3x3dbl_1.bn.bias", "Mixed_5c.branch3x3dbl_1.bn.running_mean", "Mixed_5c.branch3x3dbl_1.bn.running_var", "Mixed_5c.branch3x3dbl_2.conv.weight", "Mixed_5c.branch3x3dbl_2.bn.weight", "Mixed_5c.branch3x3dbl_2.bn.bias", "Mixed_5c.branch3x3dbl_2.bn.running_mean", "Mixed_5c.branch3x3dbl_2.bn.running_var", "Mixed_5c.branch3x3dbl_3.conv.weight", "Mixed_5c.branch3x3dbl_3.bn.weight", "Mixed_5c.branch3x3dbl_3.bn.bias", "Mixed_5c.branch3x3dbl_3.bn.running_mean", "Mixed_5c.branch3x3dbl_3.bn.running_var", "Mixed_5c.branch_pool.conv.weight", "Mixed_5c.branch_pool.bn.weight", "Mixed_5c.branch_pool.bn.bias", "Mixed_5c.branch_pool.bn.running_mean",
"Mixed_5c.branch_pool.bn.running_var", "Mixed_5d.branch1x1.conv.weight", "Mixed_5d.branch1x1.bn.weight", ........

This suggests the checkpoint does not match the network as defined in the code. Is there something wrong with the model I downloaded? Please enlighten me. Thank you very much!
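For reference, here is roughly how I am trying to load the checkpoint; the 'state_dict' wrapper and the 'module.' prefix handling below are guesses about how the file might have been saved, not something taken from your code:

     import torch

     def load_spg_checkpoint(model, ckpt_path):
         # Load on CPU so this also works on a machine without a GPU.
         ckpt = torch.load(ckpt_path, map_location='cpu')
         # Some .pth.tar checkpoints wrap the weights under a 'state_dict'
         # key (an assumption here); fall back to the raw dict otherwise.
         state_dict = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt
         # Strip a leading 'module.' prefix left behind by nn.DataParallel, if any.
         state_dict = {(k[len('module.'):] if k.startswith('module.') else k): v
                       for k, v in state_dict.items()}
         # Report model parameters that the checkpoint still does not cover.
         missing = [k for k in model.state_dict() if k not in state_dict]
         if missing:
             print('still missing:', missing[:10], '...')
         model.load_state_dict(state_dict, strict=False)
         return model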

tim120526 · Sep 13, 2018

Did you try the latest code? I tested it just now, and the pre-trained weights load successfully.
If it still does not work for you, could you share the shell script you use to load the weights?

xiaomengyc · Sep 14, 2018

Hi @xiaomengyc, I noticed that your Inception v3 model is slightly different from the official torchvision definition, even though both use the same pre-trained weights (inception_v3_google-1a9a5a14.pth). The differences are as follows:

  1. padding of Conv2d_1a_3x3 (yours vs. torchvision)
  2. padding and stride of Mixed_6a (yours vs. torchvision)
  3. padding of max_pool2d (yours vs. torchvision)

Did you make these changes intentionally? Could you explain them for me?

Thanks!

yeezhu · Sep 16, 2018

Hi @yeezhu,

  1. I added padding because, by default, torchvision produces feature maps at a different resolution than the corresponding layers in the Caffe and TensorFlow implementations. Adding the padding keeps the resolutions consistent with the baseline methods (see the sketch below).
  2. The stride of Mixed_6a is likewise changed to keep a relatively higher resolution in the final heatmaps.
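As a rough illustration of the padding point (the layer parameters below are example values, not necessarily the exact ones used in the repository), a 3x3 convolution with stride 2 gives an exact factor-of-two downsampling only when padding is added:

     import torch
     import torch.nn as nn

     x = torch.randn(1, 3, 224, 224)  # dummy 224x224 RGB input

     # Without padding (torchvision-style stem convolution): 224 -> 111
     conv_no_pad = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=0)
     # With padding=1: 224 -> 112, i.e. exactly half the input resolution
     conv_pad = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1)

     print(conv_no_pad(x).shape)  # torch.Size([1, 32, 111, 111])
     print(conv_pad(x).shape)     # torch.Size([1, 32, 112, 112])

These off-by-one differences accumulate over the later stages, which is why the unpadded torchvision layers end up with feature maps of a different size than the Caffe/TensorFlow baselines. Changing the stride of Mixed_6a has a similar motivation: with a smaller stride the network skips one downsampling step, so the final heatmaps stay at a higher resolution.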

xiaomengyc · Sep 17, 2018

Thank you, @xiaomengyc! I modified the code in inception_spg.py to construct SPG-plain and trained it on CUB, but it does not converge. Here are my settings:

  1. PyTorch 0.4.0, Python 3.6, CUDA 8.0

  2. lr=0.001 for the pre-trained weights (before Mixed_6e) and lr=0.01 for the other layers; momentum=0.9, weight_decay=0.0005 (based on the description in Section 3.3 of the paper; see the optimizer sketch at the end of this comment).

  3. The code in inception_spg.py that I changed:

     # side3/side4, branch B, and the attention map from the original
     # forward pass are commented out; only the classification branch is kept
     # side3 = self.side3(x)
     # side3 = self.side_all(side3)
     # 28 x 28 x 192
     x = self.Mixed_6a(x)
     # 28 x 28 x 768
     x = self.Mixed_6b(x)
     # 28 x 28 x 768
     x = self.Mixed_6c(x)
     # 28 x 28 x 768
     x = self.Mixed_6d(x)
     # 28 x 28 x 768
     feat = self.Mixed_6e(x)

     # side4 = self.side4(x)
     # side4 = self.side_all(side4)

     # Branch 1: classification branch only (SPG-plain)
     out1, last_feat = self.inference(feat, label=label)
     # self.map1 = out1

     # atten_map = self.get_atten_map(self.interp(out1), label, True)

     # Branch B
     # out_seg = self.branchB(last_feat)

     # global average pooling over the spatial dimensions -> class logits
     logits_1 = torch.mean(torch.mean(out1, dim=2), dim=2)

     # return [logits_1, side3, side4, out_seg, atten_map]
     return logits_1

So, could you share the detailed settings for training SPG-plain on CUB? Many thanks!
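In case it helps, this is roughly how I build the optimizer for the settings in item 2 above; the split by module-name prefix is my own guess at the layer names in inception_spg.py, not something taken from your code:

     import torch.optim as optim

     def build_optimizer(model):
         # Layers before Mixed_6e keep the ImageNet weights and get the
         # smaller learning rate; everything else (Mixed_6e and the newly
         # added layers) is trained with the larger one. The prefixes are
         # assumptions about how the modules are named.
         pretrained_prefixes = ('Conv2d_', 'Mixed_5', 'Mixed_6a',
                                'Mixed_6b', 'Mixed_6c', 'Mixed_6d')
         pretrained, scratch = [], []
         for name, param in model.named_parameters():
             (pretrained if name.startswith(pretrained_prefixes) else scratch).append(param)
         return optim.SGD(
             [{'params': pretrained, 'lr': 0.001},
              {'params': scratch}],
             lr=0.01, momentum=0.9, weight_decay=0.0005)

If your actual parameter grouping or schedule differs from this, that alone might explain part of the gap I am seeing.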

yeezhu · Sep 17, 2018