SPG
Problem loading the pre-trained model
Hi Xiaomeng! I get an error when I load the pre-trained model you provide at (https://drive.google.com/open?id=1EwRuqfGASarGidutnYB8rXLSuzYpEoSM), the file named imagenet_epoch_2_glo_step_128118.pth.tar. The error is: Missing key(s) in state_dict: "Conv2d_1a_3x3.conv.weight", "Conv2d_1a_3x3.bn.weight", "Conv2d_1a_3x3.bn.bias", "Conv2d_1a_3x3.bn.running_mean", "Conv2d_1a_3x3.bn.running_var", "Conv2d_2a_3x3.conv.weight", "Conv2d_2a_3x3.bn.weight", "Conv2d_2a_3x3.bn.bias", "Conv2d_2a_3x3.bn.running_mean", "Conv2d_2a_3x3.bn.running_var", "Conv2d_2b_3x3.conv.weight", "Conv2d_2b_3x3.bn.weight", "Conv2d_2b_3x3.bn.bias", "Conv2d_2b_3x3.bn.running_mean", "Conv2d_2b_3x3.bn.running_var", "Conv2d_3b_1x1.conv.weight", "Conv2d_3b_1x1.bn.weight", "Conv2d_3b_1x1.bn.bias", "Conv2d_3b_1x1.bn.running_mean", "Conv2d_3b_1x1.bn.running_var", "Conv2d_4a_3x3.conv.weight", "Conv2d_4a_3x3.bn.weight", "Conv2d_4a_3x3.bn.bias", "Conv2d_4a_3x3.bn.running_mean", "Conv2d_4a_3x3.bn.running_var", "Mixed_5b.branch1x1.conv.weight", "Mixed_5b.branch1x1.bn.weight", "Mixed_5b.branch1x1.bn.bias", "Mixed_5b.branch1x1.bn.running_mean", "Mixed_5b.branch1x1.bn.running_var", "Mixed_5b.branch5x5_1.conv.weight", "Mixed_5b.branch5x5_1.bn.weight", "Mixed_5b.branch5x5_1.bn.bias", "Mixed_5b.branch5x5_1.bn.running_mean", "Mixed_5b.branch5x5_1.bn.running_var", "Mixed_5b.branch5x5_2.conv.weight", "Mixed_5b.branch5x5_2.bn.weight", "Mixed_5b.branch5x5_2.bn.bias", "Mixed_5b.branch5x5_2.bn.running_mean", "Mixed_5b.branch5x5_2.bn.running_var", "Mixed_5b.branch3x3dbl_1.conv.weight", "Mixed_5b.branch3x3dbl_1.bn.weight", "Mixed_5b.branch3x3dbl_1.bn.bias", "Mixed_5b.branch3x3dbl_1.bn.running_mean", "Mixed_5b.branch3x3dbl_1.bn.running_var", "Mixed_5b.branch3x3dbl_2.conv.weight", "Mixed_5b.branch3x3dbl_2.bn.weight", "Mixed_5b.branch3x3dbl_2.bn.bias", "Mixed_5b.branch3x3dbl_2.bn.running_mean", "Mixed_5b.branch3x3dbl_2.bn.running_var", "Mixed_5b.branch3x3dbl_3.conv.weight", "Mixed_5b.branch3x3dbl_3.bn.weight", 
"Mixed_5b.branch3x3dbl_3.bn.bias", "Mixed_5b.branch3x3dbl_3.bn.running_mean", "Mixed_5b.branch3x3dbl_3.bn.running_var", "Mixed_5b.branch_pool.conv.weight", "Mixed_5b.branch_pool.bn.weight", "Mixed_5b.branch_pool.bn.bias", "Mixed_5b.branch_pool.bn.running_mean", "Mixed_5b.branch_pool.bn.running_var", "Mixed_5c.branch1x1.conv.weight", "Mixed_5c.branch1x1.bn.weight", "Mixed_5c.branch1x1.bn.bias", "Mixed_5c.branch1x1.bn.running_mean", "Mixed_5c.branch1x1.bn.running_var", "Mixed_5c.branch5x5_1.conv.weight", "Mixed_5c.branch5x5_1.bn.weight", "Mixed_5c.branch5x5_1.bn.bias", "Mixed_5c.branch5x5_1.bn.running_mean", "Mixed_5c.branch5x5_1.bn.running_var", "Mixed_5c.branch5x5_2.conv.weight", "Mixed_5c.branch5x5_2.bn.weight", "Mixed_5c.branch5x5_2.bn.bias", "Mixed_5c.branch5x5_2.bn.running_mean", "Mixed_5c.branch5x5_2.bn.running_var", "Mixed_5c.branch3x3dbl_1.conv.weight", "Mixed_5c.branch3x3dbl_1.bn.weight", "Mixed_5c.branch3x3dbl_1.bn.bias", "Mixed_5c.branch3x3dbl_1.bn.running_mean", "Mixed_5c.branch3x3dbl_1.bn.running_var", "Mixed_5c.branch3x3dbl_2.conv.weight", "Mixed_5c.branch3x3dbl_2.bn.weight", "Mixed_5c.branch3x3dbl_2.bn.bias", "Mixed_5c.branch3x3dbl_2.bn.running_mean", "Mixed_5c.branch3x3dbl_2.bn.running_var", "Mixed_5c.branch3x3dbl_3.conv.weight", "Mixed_5c.branch3x3dbl_3.bn.weight", "Mixed_5c.branch3x3dbl_3.bn.bias", "Mixed_5c.branch3x3dbl_3.bn.running_mean", "Mixed_5c.branch3x3dbl_3.bn.running_var", "Mixed_5c.branch_pool.conv.weight", "Mixed_5c.branch_pool.bn.weight", "Mixed_5c.branch_pool.bn.bias", "Mixed_5c.branch_pool.bn.running_mean", "Mixed_5c.branch_pool.bn.running_var", "Mixed_5d.branch1x1.conv.weight", "Mixed_5d.branch1x1.bn.weight", ........
This suggests that the checkpoint does not match the network definition. Is there something wrong with the model I downloaded? Please advise. Thank you very much!
Did you try the latest code? I just tested it, and the pre-trained weights load successfully.
If it is still not working for you, can you provide the shell script for using the weights?
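If the load still fails, one way to narrow it down is to compare the key names in the checkpoint with the ones the network expects. A "Missing key(s)" error often means the names differ, for example because the weights sit under a wrapper entry or carry a prefix. Below is a minimal sketch that works on plain dicts: in practice `model_keys` would come from `model.state_dict()` and `ckpt` from `torch.load(...)`. The `state_dict` wrapper entry and the `module.` prefix are assumptions about how a checkpoint might have been saved, not something confirmed for this file, and the helper names are hypothetical.

```python
def unwrap_checkpoint(ckpt, wrapper_key="state_dict"):
    """Return the inner weight dict if the checkpoint nests it under wrapper_key."""
    if wrapper_key in ckpt and isinstance(ckpt[wrapper_key], dict):
        return ckpt[wrapper_key]
    return ckpt


def strip_prefix(state, prefix="module."):
    """Drop a leading prefix (e.g. one added by nn.DataParallel) from every key."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state.items()
    }


def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) key lists, sorted for stable output."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)
    unexpected = sorted(ckpt_keys - model_keys)
    return missing, unexpected


# Toy stand-ins for model.state_dict().keys() and torch.load(path).
model_keys = ["Conv2d_1a_3x3.conv.weight", "Conv2d_1a_3x3.bn.weight"]
ckpt = {
    "epoch": 2,
    "state_dict": {
        "module.Conv2d_1a_3x3.conv.weight": 0,
        "module.Conv2d_1a_3x3.bn.weight": 0,
    },
}

raw = unwrap_checkpoint(ckpt)
print(diff_keys(model_keys, raw.keys()))    # both lists non-empty: names differ
fixed = strip_prefix(raw)
print(diff_keys(model_keys, fixed.keys()))  # ([], [])
```

If both lists come back empty after unwrapping and stripping, the renamed dict should load into the network; if keys are still missing, the checkpoint and the code version probably do not belong together.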
Hi @xiaomengyc, I noticed that your Inception v3 model differs slightly from the official torchvision model definition, although both use the same pre-trained weights (inception_v3_google-1a9a5a14.pth). The differences are as follows:
- padding of Conv2d_1a_3x3: yours, torchvision
- padding and stride of Mixed_6a: yours, torchvision
- padding of max_pool2d: yours, torchvision
Did you make these changes yourself? Could you explain the reason for them?
Thanks!
Hi @yeezhu,
- I added padding because, by default, torchvision produces feature maps whose resolution differs from that of the corresponding feature maps in Caffe and TensorFlow. Adding padding keeps the resolutions the same as in the baseline methods.
- The stride of Mixed_6a was also changed, to keep a relatively higher resolution in the final heatmaps.
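To illustrate the effect of padding on resolution, here is a small worked example using the standard output-size formula for a convolution, floor((n + 2p - k) / s) + 1. The 224 x 224 input size and the padding value of 1 are assumptions chosen for illustration; the 3x3 kernel with stride 2 matches torchvision's Conv2d_1a_3x3.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer (floor mode, dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumed 224 x 224 input; 3x3 kernel with stride 2.
no_pad = conv_out_size(224, kernel=3, stride=2, padding=0)    # torchvision default
with_pad = conv_out_size(224, kernel=3, stride=2, padding=1)  # assumed padding of 1
print(no_pad, with_pad)  # 111 112
```

Without padding the first layer yields an odd size (111), and the mismatch propagates through later layers; with padding of 1 each stride-2 layer halves the resolution exactly.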
Thank you! @xiaomengyc
I modified the code in inception_spg.py to construct SPG-plain and trained it on CUB, but it doesn't converge.
Here are my settings:
- PyTorch 0.4.0, Python 3.6, CUDA 8.0
- lr=0.001 for the pretrained weights (before Mixed_6e), lr=0.01 for the others; momentum=0.9, weight_decay=0.0005 (based on the description in Section 3.3 of the paper).
- The code in inception_spg.py that I changed:

      feat = self.Mixed_6e(x)
      # side3 = self.side3(x)
      # side3 = self.side_all(side3)
      # 28 x 28 x 192
      x = self.Mixed_6a(x)
      # 28 x 28 x 768
      x = self.Mixed_6b(x)
      # 28 x 28 x 768
      x = self.Mixed_6c(x)
      # 28 x 28 x 768
      x = self.Mixed_6d(x)
      # 28 x 28 x 768
      feat = self.Mixed_6e(x)
      # side4 = self.side4(x)
      # side4 = self.side_all(side4)
      #Branch 1
      out1, last_feat = self.inference(feat, label=label)
      # self.map1 = out1
      # atten_map = self.get_atten_map(self.interp(out1), label, True)
      #Branch B
      # out_seg = self.branchB(last_feat)
      logits_1 = torch.mean(torch.mean(out1, dim=2), dim=2)
      # return [logits_1, side3, side4, out_seg, atten_map]
      return logits_1
So, can you share the detailed settings for training SPG-plain on CUB? Many thanks!
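For reference, two learning rates like the ones listed above are usually expressed in PyTorch as per-parameter groups passed to the optimizer. The sketch below only shows how such groups could be built from parameter names; it is not the authors' training setup. The prefixes used to identify pretrained layers and the head name `cls.0.weight` are hypothetical, and the toy list stands in for `model.named_parameters()`.

```python
def build_param_groups(named_params, pretrained_prefixes,
                       base_lr=0.001, new_lr=0.01):
    """Split (name, param) pairs into two groups with different learning rates."""
    pretrained, new = [], []
    for name, param in named_params:
        # str.startswith accepts a tuple of prefixes.
        if name.startswith(tuple(pretrained_prefixes)):
            pretrained.append(param)
        else:
            new.append(param)
    return [
        {"params": pretrained, "lr": base_lr},
        {"params": new, "lr": new_lr},
    ]


# Toy stand-in for model.named_parameters(); strings replace real tensors.
named_params = [
    ("Conv2d_1a_3x3.conv.weight", "w0"),
    ("Mixed_6e.branch1x1.conv.weight", "w1"),
    ("cls.0.weight", "w2"),  # hypothetical name for a newly added layer
]
groups = build_param_groups(named_params, ("Conv2d_", "Mixed_"))
print([(len(g["params"]), g["lr"]) for g in groups])  # [(2, 0.001), (1, 0.01)]

# With real parameters, the groups could be passed to the optimizer as:
# optimizer = torch.optim.SGD(groups, lr=0.001, momentum=0.9, weight_decay=0.0005)
```

Printing the group sizes before training is a quick check that no layer ended up in the wrong group or was left out, which is one possible cause of a run that fails to converge.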