TRN-pytorch

Dimensions in model don't match the dimensions in the checkpoint

Open michaelzhang917 opened this issue 6 years ago • 7 comments

I was trying to run test_video.py and got the following error messages.

Traceback (most recent call last):
  File "test_video.py", line 104, in <module>
    img_feature_dim=args.img_feature_dim, print_spec=False)
  File "/home/vbalab/projects/SimpleMovementDetection/TRN-pytorch/models.py", line 43, in __init__
    self._prepare_base_model(base_model)
  File "/home/vbalab/projects/SimpleMovementDetection/TRN-pytorch/models.py", line 120, in _prepare_base_model
    self.base_model = getattr(model_zoo, base_model)()
  File "/home/vbalab/projects/SimpleMovementDetection/TRN-pytorch/model_zoo/bninception/pytorch_load.py", line 67, in __init__
    super(InceptionV3, self).__init__(model_path=model_path, weight_url=weight_url, num_classes=num_classes)
  File "/home/vbalab/projects/SimpleMovementDetection/TRN-pytorch/model_zoo/bninception/pytorch_load.py", line 35, in __init__
    self.load_state_dict(torch.utils.model_zoo.load_url(weight_url))
  File "/home/vbalab/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for InceptionV3:
    While copying the parameter named "conv_batchnorm.weight", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_batchnorm.bias", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_batchnorm.running_mean", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_batchnorm.running_var", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_1_batchnorm.weight", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_1_batchnorm.bias", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_1_batchnorm.running_mean", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_1_batchnorm.running_var", whose dimensions in the model are torch.Size([32]) and whose dimensions in the checkpoint are torch.Size([1, 32]).
    While copying the parameter named "conv_2_batchnorm.weight", whose dimensions in the model are torch.Size([64]) and whose dimensions in the checkpoint are torch.Size([1, 64]).
    While copying the parameter named "conv_2_batchnorm.bias", whose dimensions in the model are torch.Size([64]) and whose dimensions in the checkpoint are torch.Size([1, 64]).
    While copying the parameter named "conv_2_batchnorm.running_mean", whose dimensions in the model are torch.Size([64]) and whose dimensions in the checkpoint are torch.Size([1, 64]).
    .......

michaelzhang917 • May 25 '18 21:05

Hi Yong Zhang,

I suspect that you are using the latest version of PyTorch (v0.4). Can you confirm this? We have run into the same issue since upgrading. In the most recent commit (https://github.com/metalbubble/TRN-pytorch/commit/5005582e1b3acf3f901048052741df9b43bf3a4f), I have included links to download a new checkpoint file in which the sizes of the batchnorm parameters have been corrected.

Unfortunately, the same problem exists in the base model checkpoint files. The base BNInception/InceptionV3 checkpoints fail to load with this error before the TRN pretrained model is even read. A quick fix is to comment out this line: https://github.com/yjxiong/tensorflow-model-zoo.torch/blob/e31e0b7aa451e2c12c0107e616953a03d8cd0d47/bninception/pytorch_load.py#L35

This prevents the original base model weights from being loaded into the model. That is not a problem, since those weights are overwritten shortly afterwards by the ones from the TRN state_dict. Give this a shot and let me know if it helps. It seems to solve the issue on our end.
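If you would rather keep loading the base weights instead of commenting the line out, another option is to drop the extra leading dimension from the checkpoint entries before passing them to load_state_dict. The following is only a minimal sketch and not code from the TRN repository: the helper name reconcile_shapes and the expected_shapes argument are made up for illustration, and it assumes the only mismatch is a leading singleton dimension such as [1, 32] versus [32], as in the error above.

```python
from collections import OrderedDict


def reconcile_shapes(checkpoint, expected_shapes):
    """Return a copy of `checkpoint` whose [1, N]-style entries are squeezed to [N].

    `checkpoint` maps parameter names to tensor-like objects exposing
    `.shape` and `.squeeze(dim)` (torch tensors do). `expected_shapes` maps
    the same names to the shapes the model expects. An entry is squeezed
    only when dropping its leading dimension of size 1 yields exactly the
    expected shape; every other entry is passed through untouched.
    """
    fixed = OrderedDict()
    for name, tensor in checkpoint.items():
        have = tuple(tensor.shape)
        want = expected_shapes.get(name)
        if want is not None:
            want = tuple(want)
            if have != want and have[:1] == (1,) and have[1:] == want:
                tensor = tensor.squeeze(0)
        fixed[name] = tensor
    return fixed


# Hypothetical use inside pytorch_load.py, replacing the failing line:
#   expected = {k: v.shape for k, v in self.state_dict().items()}
#   raw = torch.utils.model_zoo.load_url(weight_url)
#   self.load_state_dict(reconcile_shapes(raw, expected))
```

Comparing against the shapes the model expects, rather than squeezing every [1, N] tensor, avoids flattening a weight that is meant to have a leading dimension of 1.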

Best, Alex A

alexandonian • May 26 '18 04:05

Thanks, Alex! It works for the moments pretrained model now, but I get the following errors for the jester pretrained model.

Could you take a look one more time?

Traceback (most recent call last):
  File "test_video.py", line 111, in <module>
    net.load_state_dict(base_dict)
  File "/home/vbalab/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TSN:
Missing key(s) in state_dict: "base_model.conv_Conv2D.weight", "base_model.conv_Conv2D.bias", "base_model.conv_batchnorm.weight", "base_model.conv_batchnorm.bias", "base_model.conv_batchnorm.running_mean", "base_model.conv_batchnorm.running_var", "base_model.conv_1_Conv2D.weight", "base_model.conv_1_Conv2D.bias", "base_model.conv_1_batchnorm.weight", "base_model.conv_1_batchnorm.bias", "base_model.conv_1_batchnorm.running_mean", "base_model.conv_1_batchnorm.running_var", "base_model.conv_2_Conv2D.weight", "base_model.conv_2_Conv2D.bias", "base_model.conv_2_batchnorm.weight", "base_model.conv_2_batchnorm.bias", "base_model.conv_2_batchnorm.running_mean", "base_model.conv_2_batchnorm.running_var", "base_model.conv_3_Conv2D.weight", "base_model.conv_3_Conv2D.bias", "base_model.conv_3_batchnorm.weight", "base_model.conv_3_batchnorm.bias", "base_model.conv_3_batchnorm.running_mean", "base_model.conv_3_batchnorm.running_var", "base_model.conv_4_Conv2D.weight", "base_model.conv_4_Conv2D.bias", "base_model.conv_4_batchnorm.weight", "base_model.conv_4_batchnorm.bias", "base_model.conv_4_batchnorm.running_mean", "base_model.conv_4_batchnorm.running_var", "base_model.mixed_conv_Conv2D.weight", "base_model.mixed_conv_Conv2D.bias", "base_model.mixed_conv_batchnorm.weight", "base_model.mixed_conv_batchnorm.bias", "base_model.mixed_conv_batchnorm.running_mean", "base_model.mixed_conv_batchnorm.running_var", "base_model.mixed_tower_conv_Conv2D.weight", "base_model.mixed_tower_conv_Conv2D.bias", "base_model.mixed_tower_conv_batchnorm.weight", "base_model.mixed_tower_conv_batchnorm.bias", "base_model.mixed_tower_conv_batchnorm.running_mean", 
"base_model.mixed_tower_conv_batchnorm.running_var", "base_model.mixed_tower_conv_1_Conv2D.weight", "base_model.mixed_tower_conv_1_Conv2D.bias", "base_model.mixed_tower_conv_1_batchnorm.weight", "base_model.mixed_tower_conv_1_batchnorm.bias", "base_model.mixed_tower_conv_1_batchnorm.running_mean", "base_model.mixed_tower_conv_1_batchnorm.running_var", "base_model.mixed_tower_1_conv_Conv2D.weight", "base_model.mixed_tower_1_conv_Conv2D.bias", "base_model.mixed_tower_1_conv_batchnorm.weight", "base_model.mixed_tower_1_conv_batchnorm.bias", "base_model.mixed_tower_1_conv_batchnorm.running_mean", "base_model.mixed_tower_1_conv_batchnorm.running_var", "base_model.mixed_tower_1_conv_1_Conv2D.weight", "base_model.mixed_tower_1_conv_1_Conv2D.bias", "base_model.mixed_tower_1_conv_1_batchnorm.weight", "base_model.mixed_tower_1_conv_1_batchnorm.bias", "base_model.mixed_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_tower_1_conv_2_Conv2D.weight", "base_model.mixed_tower_1_conv_2_Conv2D.bias", "base_model.mixed_tower_1_conv_2_batchnorm.weight", "base_model.mixed_tower_1_conv_2_batchnorm.bias", "base_model.mixed_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_tower_2_conv_Conv2D.weight", "base_model.mixed_tower_2_conv_Conv2D.bias", "base_model.mixed_tower_2_conv_batchnorm.weight", "base_model.mixed_tower_2_conv_batchnorm.bias", "base_model.mixed_tower_2_conv_batchnorm.running_mean", "base_model.mixed_tower_2_conv_batchnorm.running_var", "base_model.mixed_1_conv_Conv2D.weight", "base_model.mixed_1_conv_Conv2D.bias", "base_model.mixed_1_conv_batchnorm.weight", "base_model.mixed_1_conv_batchnorm.bias", "base_model.mixed_1_conv_batchnorm.running_mean", "base_model.mixed_1_conv_batchnorm.running_var", "base_model.mixed_1_tower_conv_Conv2D.weight", "base_model.mixed_1_tower_conv_Conv2D.bias", "base_model.mixed_1_tower_conv_batchnorm.weight", 
"base_model.mixed_1_tower_conv_batchnorm.bias", "base_model.mixed_1_tower_conv_batchnorm.running_mean", "base_model.mixed_1_tower_conv_batchnorm.running_var", "base_model.mixed_1_tower_conv_1_Conv2D.weight", "base_model.mixed_1_tower_conv_1_Conv2D.bias", "base_model.mixed_1_tower_conv_1_batchnorm.weight", "base_model.mixed_1_tower_conv_1_batchnorm.bias", "base_model.mixed_1_tower_conv_1_batchnorm.running_mean", "base_model.mixed_1_tower_conv_1_batchnorm.running_var", "base_model.mixed_1_tower_1_conv_Conv2D.weight", "base_model.mixed_1_tower_1_conv_Conv2D.bias", "base_model.mixed_1_tower_1_conv_batchnorm.weight", "base_model.mixed_1_tower_1_conv_batchnorm.bias", "base_model.mixed_1_tower_1_conv_batchnorm.running_mean", "base_model.mixed_1_tower_1_conv_batchnorm.running_var", "base_model.mixed_1_tower_1_conv_1_Conv2D.weight", "base_model.mixed_1_tower_1_conv_1_Conv2D.bias", "base_model.mixed_1_tower_1_conv_1_batchnorm.weight", "base_model.mixed_1_tower_1_conv_1_batchnorm.bias", "base_model.mixed_1_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_1_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_1_tower_1_conv_2_Conv2D.weight", "base_model.mixed_1_tower_1_conv_2_Conv2D.bias", "base_model.mixed_1_tower_1_conv_2_batchnorm.weight", "base_model.mixed_1_tower_1_conv_2_batchnorm.bias", "base_model.mixed_1_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_1_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_1_tower_2_conv_Conv2D.weight", "base_model.mixed_1_tower_2_conv_Conv2D.bias", "base_model.mixed_1_tower_2_conv_batchnorm.weight", "base_model.mixed_1_tower_2_conv_batchnorm.bias", "base_model.mixed_1_tower_2_conv_batchnorm.running_mean", "base_model.mixed_1_tower_2_conv_batchnorm.running_var", "base_model.mixed_2_conv_Conv2D.weight", "base_model.mixed_2_conv_Conv2D.bias", "base_model.mixed_2_conv_batchnorm.weight", "base_model.mixed_2_conv_batchnorm.bias", "base_model.mixed_2_conv_batchnorm.running_mean", 
"base_model.mixed_2_conv_batchnorm.running_var", "base_model.mixed_2_tower_conv_Conv2D.weight", "base_model.mixed_2_tower_conv_Conv2D.bias", "base_model.mixed_2_tower_conv_batchnorm.weight", "base_model.mixed_2_tower_conv_batchnorm.bias", "base_model.mixed_2_tower_conv_batchnorm.running_mean", "base_model.mixed_2_tower_conv_batchnorm.running_var", "base_model.mixed_2_tower_conv_1_Conv2D.weight", "base_model.mixed_2_tower_conv_1_Conv2D.bias", "base_model.mixed_2_tower_conv_1_batchnorm.weight", "base_model.mixed_2_tower_conv_1_batchnorm.bias", "base_model.mixed_2_tower_conv_1_batchnorm.running_mean", "base_model.mixed_2_tower_conv_1_batchnorm.running_var", "base_model.mixed_2_tower_1_conv_Conv2D.weight", "base_model.mixed_2_tower_1_conv_Conv2D.bias", "base_model.mixed_2_tower_1_conv_batchnorm.weight", "base_model.mixed_2_tower_1_conv_batchnorm.bias", "base_model.mixed_2_tower_1_conv_batchnorm.running_mean", "base_model.mixed_2_tower_1_conv_batchnorm.running_var", "base_model.mixed_2_tower_1_conv_1_Conv2D.weight", "base_model.mixed_2_tower_1_conv_1_Conv2D.bias", "base_model.mixed_2_tower_1_conv_1_batchnorm.weight", "base_model.mixed_2_tower_1_conv_1_batchnorm.bias", "base_model.mixed_2_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_2_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_2_tower_1_conv_2_Conv2D.weight", "base_model.mixed_2_tower_1_conv_2_Conv2D.bias", "base_model.mixed_2_tower_1_conv_2_batchnorm.weight", "base_model.mixed_2_tower_1_conv_2_batchnorm.bias", "base_model.mixed_2_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_2_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_2_tower_2_conv_Conv2D.weight", "base_model.mixed_2_tower_2_conv_Conv2D.bias", "base_model.mixed_2_tower_2_conv_batchnorm.weight", "base_model.mixed_2_tower_2_conv_batchnorm.bias", "base_model.mixed_2_tower_2_conv_batchnorm.running_mean", "base_model.mixed_2_tower_2_conv_batchnorm.running_var", "base_model.mixed_3_conv_Conv2D.weight", 
"base_model.mixed_3_conv_Conv2D.bias", "base_model.mixed_3_conv_batchnorm.weight", "base_model.mixed_3_conv_batchnorm.bias", "base_model.mixed_3_conv_batchnorm.running_mean", "base_model.mixed_3_conv_batchnorm.running_var", "base_model.mixed_3_tower_conv_Conv2D.weight", "base_model.mixed_3_tower_conv_Conv2D.bias", "base_model.mixed_3_tower_conv_batchnorm.weight", "base_model.mixed_3_tower_conv_batchnorm.bias", "base_model.mixed_3_tower_conv_batchnorm.running_mean", "base_model.mixed_3_tower_conv_batchnorm.running_var", "base_model.mixed_3_tower_conv_1_Conv2D.weight", "base_model.mixed_3_tower_conv_1_Conv2D.bias", "base_model.mixed_3_tower_conv_1_batchnorm.weight", "base_model.mixed_3_tower_conv_1_batchnorm.bias", "base_model.mixed_3_tower_conv_1_batchnorm.running_mean", "base_model.mixed_3_tower_conv_1_batchnorm.running_var", "base_model.mixed_3_tower_conv_2_Conv2D.weight", "base_model.mixed_3_tower_conv_2_Conv2D.bias", "base_model.mixed_3_tower_conv_2_batchnorm.weight", "base_model.mixed_3_tower_conv_2_batchnorm.bias", "base_model.mixed_3_tower_conv_2_batchnorm.running_mean", "base_model.mixed_3_tower_conv_2_batchnorm.running_var", "base_model.mixed_4_conv_Conv2D.weight", "base_model.mixed_4_conv_Conv2D.bias", "base_model.mixed_4_conv_batchnorm.weight", "base_model.mixed_4_conv_batchnorm.bias", "base_model.mixed_4_conv_batchnorm.running_mean", "base_model.mixed_4_conv_batchnorm.running_var", "base_model.mixed_4_tower_conv_Conv2D.weight", "base_model.mixed_4_tower_conv_Conv2D.bias", "base_model.mixed_4_tower_conv_batchnorm.weight", "base_model.mixed_4_tower_conv_batchnorm.bias", "base_model.mixed_4_tower_conv_batchnorm.running_mean", "base_model.mixed_4_tower_conv_batchnorm.running_var", "base_model.mixed_4_tower_conv_1_Conv2D.weight", "base_model.mixed_4_tower_conv_1_Conv2D.bias", "base_model.mixed_4_tower_conv_1_batchnorm.weight", "base_model.mixed_4_tower_conv_1_batchnorm.bias", "base_model.mixed_4_tower_conv_1_batchnorm.running_mean", 
"base_model.mixed_4_tower_conv_1_batchnorm.running_var", "base_model.mixed_4_tower_conv_2_Conv2D.weight", "base_model.mixed_4_tower_conv_2_Conv2D.bias", "base_model.mixed_4_tower_conv_2_batchnorm.weight", "base_model.mixed_4_tower_conv_2_batchnorm.bias", "base_model.mixed_4_tower_conv_2_batchnorm.running_mean", "base_model.mixed_4_tower_conv_2_batchnorm.running_var", "base_model.mixed_4_tower_1_conv_Conv2D.weight", "base_model.mixed_4_tower_1_conv_Conv2D.bias", "base_model.mixed_4_tower_1_conv_batchnorm.weight", "base_model.mixed_4_tower_1_conv_batchnorm.bias", "base_model.mixed_4_tower_1_conv_batchnorm.running_mean", "base_model.mixed_4_tower_1_conv_batchnorm.running_var", "base_model.mixed_4_tower_1_conv_1_Conv2D.weight", "base_model.mixed_4_tower_1_conv_1_Conv2D.bias", "base_model.mixed_4_tower_1_conv_1_batchnorm.weight", "base_model.mixed_4_tower_1_conv_1_batchnorm.bias", "base_model.mixed_4_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_4_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_4_tower_1_conv_2_Conv2D.weight", "base_model.mixed_4_tower_1_conv_2_Conv2D.bias", "base_model.mixed_4_tower_1_conv_2_batchnorm.weight", "base_model.mixed_4_tower_1_conv_2_batchnorm.bias", "base_model.mixed_4_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_4_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_4_tower_1_conv_3_Conv2D.weight", "base_model.mixed_4_tower_1_conv_3_Conv2D.bias", "base_model.mixed_4_tower_1_conv_3_batchnorm.weight", "base_model.mixed_4_tower_1_conv_3_batchnorm.bias", "base_model.mixed_4_tower_1_conv_3_batchnorm.running_mean", "base_model.mixed_4_tower_1_conv_3_batchnorm.running_var", "base_model.mixed_4_tower_1_conv_4_Conv2D.weight", "base_model.mixed_4_tower_1_conv_4_Conv2D.bias", "base_model.mixed_4_tower_1_conv_4_batchnorm.weight", "base_model.mixed_4_tower_1_conv_4_batchnorm.bias", "base_model.mixed_4_tower_1_conv_4_batchnorm.running_mean", "base_model.mixed_4_tower_1_conv_4_batchnorm.running_var", 
"base_model.mixed_4_tower_2_conv_Conv2D.weight", "base_model.mixed_4_tower_2_conv_Conv2D.bias", "base_model.mixed_4_tower_2_conv_batchnorm.weight", "base_model.mixed_4_tower_2_conv_batchnorm.bias", "base_model.mixed_4_tower_2_conv_batchnorm.running_mean", "base_model.mixed_4_tower_2_conv_batchnorm.running_var", "base_model.mixed_5_conv_Conv2D.weight", "base_model.mixed_5_conv_Conv2D.bias", "base_model.mixed_5_conv_batchnorm.weight", "base_model.mixed_5_conv_batchnorm.bias", "base_model.mixed_5_conv_batchnorm.running_mean", "base_model.mixed_5_conv_batchnorm.running_var", "base_model.mixed_5_tower_conv_Conv2D.weight", "base_model.mixed_5_tower_conv_Conv2D.bias", "base_model.mixed_5_tower_conv_batchnorm.weight", "base_model.mixed_5_tower_conv_batchnorm.bias", "base_model.mixed_5_tower_conv_batchnorm.running_mean", "base_model.mixed_5_tower_conv_batchnorm.running_var", "base_model.mixed_5_tower_conv_1_Conv2D.weight", "base_model.mixed_5_tower_conv_1_Conv2D.bias", "base_model.mixed_5_tower_conv_1_batchnorm.weight", "base_model.mixed_5_tower_conv_1_batchnorm.bias", "base_model.mixed_5_tower_conv_1_batchnorm.running_mean", "base_model.mixed_5_tower_conv_1_batchnorm.running_var", "base_model.mixed_5_tower_conv_2_Conv2D.weight", "base_model.mixed_5_tower_conv_2_Conv2D.bias", "base_model.mixed_5_tower_conv_2_batchnorm.weight", "base_model.mixed_5_tower_conv_2_batchnorm.bias", "base_model.mixed_5_tower_conv_2_batchnorm.running_mean", "base_model.mixed_5_tower_conv_2_batchnorm.running_var", "base_model.mixed_5_tower_1_conv_Conv2D.weight", "base_model.mixed_5_tower_1_conv_Conv2D.bias", "base_model.mixed_5_tower_1_conv_batchnorm.weight", "base_model.mixed_5_tower_1_conv_batchnorm.bias", "base_model.mixed_5_tower_1_conv_batchnorm.running_mean", "base_model.mixed_5_tower_1_conv_batchnorm.running_var", "base_model.mixed_5_tower_1_conv_1_Conv2D.weight", "base_model.mixed_5_tower_1_conv_1_Conv2D.bias", "base_model.mixed_5_tower_1_conv_1_batchnorm.weight", 
"base_model.mixed_5_tower_1_conv_1_batchnorm.bias", "base_model.mixed_5_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_5_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_5_tower_1_conv_2_Conv2D.weight", "base_model.mixed_5_tower_1_conv_2_Conv2D.bias", "base_model.mixed_5_tower_1_conv_2_batchnorm.weight", "base_model.mixed_5_tower_1_conv_2_batchnorm.bias", "base_model.mixed_5_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_5_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_5_tower_1_conv_3_Conv2D.weight", "base_model.mixed_5_tower_1_conv_3_Conv2D.bias", "base_model.mixed_5_tower_1_conv_3_batchnorm.weight", "base_model.mixed_5_tower_1_conv_3_batchnorm.bias", "base_model.mixed_5_tower_1_conv_3_batchnorm.running_mean", "base_model.mixed_5_tower_1_conv_3_batchnorm.running_var", "base_model.mixed_5_tower_1_conv_4_Conv2D.weight", "base_model.mixed_5_tower_1_conv_4_Conv2D.bias", "base_model.mixed_5_tower_1_conv_4_batchnorm.weight", "base_model.mixed_5_tower_1_conv_4_batchnorm.bias", "base_model.mixed_5_tower_1_conv_4_batchnorm.running_mean", "base_model.mixed_5_tower_1_conv_4_batchnorm.running_var", "base_model.mixed_5_tower_2_conv_Conv2D.weight", "base_model.mixed_5_tower_2_conv_Conv2D.bias", "base_model.mixed_5_tower_2_conv_batchnorm.weight", "base_model.mixed_5_tower_2_conv_batchnorm.bias", "base_model.mixed_5_tower_2_conv_batchnorm.running_mean", "base_model.mixed_5_tower_2_conv_batchnorm.running_var", "base_model.mixed_6_conv_Conv2D.weight", "base_model.mixed_6_conv_Conv2D.bias", "base_model.mixed_6_conv_batchnorm.weight", "base_model.mixed_6_conv_batchnorm.bias", "base_model.mixed_6_conv_batchnorm.running_mean", "base_model.mixed_6_conv_batchnorm.running_var", "base_model.mixed_6_tower_conv_Conv2D.weight", "base_model.mixed_6_tower_conv_Conv2D.bias", "base_model.mixed_6_tower_conv_batchnorm.weight", "base_model.mixed_6_tower_conv_batchnorm.bias", "base_model.mixed_6_tower_conv_batchnorm.running_mean", 
"base_model.mixed_6_tower_conv_batchnorm.running_var", "base_model.mixed_6_tower_conv_1_Conv2D.weight", "base_model.mixed_6_tower_conv_1_Conv2D.bias", "base_model.mixed_6_tower_conv_1_batchnorm.weight", "base_model.mixed_6_tower_conv_1_batchnorm.bias", "base_model.mixed_6_tower_conv_1_batchnorm.running_mean", "base_model.mixed_6_tower_conv_1_batchnorm.running_var", "base_model.mixed_6_tower_conv_2_Conv2D.weight", "base_model.mixed_6_tower_conv_2_Conv2D.bias", "base_model.mixed_6_tower_conv_2_batchnorm.weight", "base_model.mixed_6_tower_conv_2_batchnorm.bias", "base_model.mixed_6_tower_conv_2_batchnorm.running_mean", "base_model.mixed_6_tower_conv_2_batchnorm.running_var", "base_model.mixed_6_tower_1_conv_Conv2D.weight", "base_model.mixed_6_tower_1_conv_Conv2D.bias", "base_model.mixed_6_tower_1_conv_batchnorm.weight", "base_model.mixed_6_tower_1_conv_batchnorm.bias", "base_model.mixed_6_tower_1_conv_batchnorm.running_mean", "base_model.mixed_6_tower_1_conv_batchnorm.running_var", "base_model.mixed_6_tower_1_conv_1_Conv2D.weight", "base_model.mixed_6_tower_1_conv_1_Conv2D.bias", "base_model.mixed_6_tower_1_conv_1_batchnorm.weight", "base_model.mixed_6_tower_1_conv_1_batchnorm.bias", "base_model.mixed_6_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_6_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_6_tower_1_conv_2_Conv2D.weight", "base_model.mixed_6_tower_1_conv_2_Conv2D.bias", "base_model.mixed_6_tower_1_conv_2_batchnorm.weight", "base_model.mixed_6_tower_1_conv_2_batchnorm.bias", "base_model.mixed_6_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_6_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_6_tower_1_conv_3_Conv2D.weight", "base_model.mixed_6_tower_1_conv_3_Conv2D.bias", "base_model.mixed_6_tower_1_conv_3_batchnorm.weight", "base_model.mixed_6_tower_1_conv_3_batchnorm.bias", "base_model.mixed_6_tower_1_conv_3_batchnorm.running_mean", "base_model.mixed_6_tower_1_conv_3_batchnorm.running_var", 
"base_model.mixed_6_tower_1_conv_4_Conv2D.weight", "base_model.mixed_6_tower_1_conv_4_Conv2D.bias", "base_model.mixed_6_tower_1_conv_4_batchnorm.weight", "base_model.mixed_6_tower_1_conv_4_batchnorm.bias", "base_model.mixed_6_tower_1_conv_4_batchnorm.running_mean", "base_model.mixed_6_tower_1_conv_4_batchnorm.running_var", "base_model.mixed_6_tower_2_conv_Conv2D.weight", "base_model.mixed_6_tower_2_conv_Conv2D.bias", "base_model.mixed_6_tower_2_conv_batchnorm.weight", "base_model.mixed_6_tower_2_conv_batchnorm.bias", "base_model.mixed_6_tower_2_conv_batchnorm.running_mean", "base_model.mixed_6_tower_2_conv_batchnorm.running_var", "base_model.mixed_7_conv_Conv2D.weight", "base_model.mixed_7_conv_Conv2D.bias", "base_model.mixed_7_conv_batchnorm.weight", "base_model.mixed_7_conv_batchnorm.bias", "base_model.mixed_7_conv_batchnorm.running_mean", "base_model.mixed_7_conv_batchnorm.running_var", "base_model.mixed_7_tower_conv_Conv2D.weight", "base_model.mixed_7_tower_conv_Conv2D.bias", "base_model.mixed_7_tower_conv_batchnorm.weight", "base_model.mixed_7_tower_conv_batchnorm.bias", "base_model.mixed_7_tower_conv_batchnorm.running_mean", "base_model.mixed_7_tower_conv_batchnorm.running_var", "base_model.mixed_7_tower_conv_1_Conv2D.weight", "base_model.mixed_7_tower_conv_1_Conv2D.bias", "base_model.mixed_7_tower_conv_1_batchnorm.weight", "base_model.mixed_7_tower_conv_1_batchnorm.bias", "base_model.mixed_7_tower_conv_1_batchnorm.running_mean", "base_model.mixed_7_tower_conv_1_batchnorm.running_var", "base_model.mixed_7_tower_conv_2_Conv2D.weight", "base_model.mixed_7_tower_conv_2_Conv2D.bias", "base_model.mixed_7_tower_conv_2_batchnorm.weight", "base_model.mixed_7_tower_conv_2_batchnorm.bias", "base_model.mixed_7_tower_conv_2_batchnorm.running_mean", "base_model.mixed_7_tower_conv_2_batchnorm.running_var", "base_model.mixed_7_tower_1_conv_Conv2D.weight", "base_model.mixed_7_tower_1_conv_Conv2D.bias", "base_model.mixed_7_tower_1_conv_batchnorm.weight", 
"base_model.mixed_7_tower_1_conv_batchnorm.bias", "base_model.mixed_7_tower_1_conv_batchnorm.running_mean", "base_model.mixed_7_tower_1_conv_batchnorm.running_var", "base_model.mixed_7_tower_1_conv_1_Conv2D.weight", "base_model.mixed_7_tower_1_conv_1_Conv2D.bias", "base_model.mixed_7_tower_1_conv_1_batchnorm.weight", "base_model.mixed_7_tower_1_conv_1_batchnorm.bias", "base_model.mixed_7_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_7_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_7_tower_1_conv_2_Conv2D.weight", "base_model.mixed_7_tower_1_conv_2_Conv2D.bias", "base_model.mixed_7_tower_1_conv_2_batchnorm.weight", "base_model.mixed_7_tower_1_conv_2_batchnorm.bias", "base_model.mixed_7_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_7_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_7_tower_1_conv_3_Conv2D.weight", "base_model.mixed_7_tower_1_conv_3_Conv2D.bias", "base_model.mixed_7_tower_1_conv_3_batchnorm.weight", "base_model.mixed_7_tower_1_conv_3_batchnorm.bias", "base_model.mixed_7_tower_1_conv_3_batchnorm.running_mean", "base_model.mixed_7_tower_1_conv_3_batchnorm.running_var", "base_model.mixed_7_tower_1_conv_4_Conv2D.weight", "base_model.mixed_7_tower_1_conv_4_Conv2D.bias", "base_model.mixed_7_tower_1_conv_4_batchnorm.weight", "base_model.mixed_7_tower_1_conv_4_batchnorm.bias", "base_model.mixed_7_tower_1_conv_4_batchnorm.running_mean", "base_model.mixed_7_tower_1_conv_4_batchnorm.running_var", "base_model.mixed_7_tower_2_conv_Conv2D.weight", "base_model.mixed_7_tower_2_conv_Conv2D.bias", "base_model.mixed_7_tower_2_conv_batchnorm.weight", "base_model.mixed_7_tower_2_conv_batchnorm.bias", "base_model.mixed_7_tower_2_conv_batchnorm.running_mean", "base_model.mixed_7_tower_2_conv_batchnorm.running_var", "base_model.mixed_8_tower_conv_Conv2D.weight", "base_model.mixed_8_tower_conv_Conv2D.bias", "base_model.mixed_8_tower_conv_batchnorm.weight", "base_model.mixed_8_tower_conv_batchnorm.bias", 
"base_model.mixed_8_tower_conv_batchnorm.running_mean", "base_model.mixed_8_tower_conv_batchnorm.running_var", "base_model.mixed_8_tower_conv_1_Conv2D.weight", "base_model.mixed_8_tower_conv_1_Conv2D.bias", "base_model.mixed_8_tower_conv_1_batchnorm.weight", "base_model.mixed_8_tower_conv_1_batchnorm.bias", "base_model.mixed_8_tower_conv_1_batchnorm.running_mean", "base_model.mixed_8_tower_conv_1_batchnorm.running_var", "base_model.mixed_8_tower_1_conv_Conv2D.weight", "base_model.mixed_8_tower_1_conv_Conv2D.bias", "base_model.mixed_8_tower_1_conv_batchnorm.weight", "base_model.mixed_8_tower_1_conv_batchnorm.bias", "base_model.mixed_8_tower_1_conv_batchnorm.running_mean", "base_model.mixed_8_tower_1_conv_batchnorm.running_var", "base_model.mixed_8_tower_1_conv_1_Conv2D.weight", "base_model.mixed_8_tower_1_conv_1_Conv2D.bias", "base_model.mixed_8_tower_1_conv_1_batchnorm.weight", "base_model.mixed_8_tower_1_conv_1_batchnorm.bias", "base_model.mixed_8_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_8_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_8_tower_1_conv_2_Conv2D.weight", "base_model.mixed_8_tower_1_conv_2_Conv2D.bias", "base_model.mixed_8_tower_1_conv_2_batchnorm.weight", "base_model.mixed_8_tower_1_conv_2_batchnorm.bias", "base_model.mixed_8_tower_1_conv_2_batchnorm.running_mean", "base_model.mixed_8_tower_1_conv_2_batchnorm.running_var", "base_model.mixed_8_tower_1_conv_3_Conv2D.weight", "base_model.mixed_8_tower_1_conv_3_Conv2D.bias", "base_model.mixed_8_tower_1_conv_3_batchnorm.weight", "base_model.mixed_8_tower_1_conv_3_batchnorm.bias", "base_model.mixed_8_tower_1_conv_3_batchnorm.running_mean", "base_model.mixed_8_tower_1_conv_3_batchnorm.running_var", "base_model.mixed_9_conv_Conv2D.weight", "base_model.mixed_9_conv_Conv2D.bias", "base_model.mixed_9_conv_batchnorm.weight", "base_model.mixed_9_conv_batchnorm.bias", "base_model.mixed_9_conv_batchnorm.running_mean", "base_model.mixed_9_conv_batchnorm.running_var", 
"base_model.mixed_9_tower_conv_Conv2D.weight", "base_model.mixed_9_tower_conv_Conv2D.bias", "base_model.mixed_9_tower_conv_batchnorm.weight", "base_model.mixed_9_tower_conv_batchnorm.bias", "base_model.mixed_9_tower_conv_batchnorm.running_mean", "base_model.mixed_9_tower_conv_batchnorm.running_var", "base_model.mixed_9_tower_mixed_conv_Conv2D.weight", "base_model.mixed_9_tower_mixed_conv_Conv2D.bias", "base_model.mixed_9_tower_mixed_conv_batchnorm.weight", "base_model.mixed_9_tower_mixed_conv_batchnorm.bias", "base_model.mixed_9_tower_mixed_conv_batchnorm.running_mean", "base_model.mixed_9_tower_mixed_conv_batchnorm.running_var", "base_model.mixed_9_tower_mixed_conv_1_Conv2D.weight", "base_model.mixed_9_tower_mixed_conv_1_Conv2D.bias", "base_model.mixed_9_tower_mixed_conv_1_batchnorm.weight", "base_model.mixed_9_tower_mixed_conv_1_batchnorm.bias", "base_model.mixed_9_tower_mixed_conv_1_batchnorm.running_mean", "base_model.mixed_9_tower_mixed_conv_1_batchnorm.running_var", "base_model.mixed_9_tower_1_conv_Conv2D.weight", "base_model.mixed_9_tower_1_conv_Conv2D.bias", "base_model.mixed_9_tower_1_conv_batchnorm.weight", "base_model.mixed_9_tower_1_conv_batchnorm.bias", "base_model.mixed_9_tower_1_conv_batchnorm.running_mean", "base_model.mixed_9_tower_1_conv_batchnorm.running_var", "base_model.mixed_9_tower_1_conv_1_Conv2D.weight", "base_model.mixed_9_tower_1_conv_1_Conv2D.bias", "base_model.mixed_9_tower_1_conv_1_batchnorm.weight", "base_model.mixed_9_tower_1_conv_1_batchnorm.bias", "base_model.mixed_9_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_9_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_9_tower_1_mixed_conv_Conv2D.weight", "base_model.mixed_9_tower_1_mixed_conv_Conv2D.bias", "base_model.mixed_9_tower_1_mixed_conv_batchnorm.weight", "base_model.mixed_9_tower_1_mixed_conv_batchnorm.bias", "base_model.mixed_9_tower_1_mixed_conv_batchnorm.running_mean", "base_model.mixed_9_tower_1_mixed_conv_batchnorm.running_var", 
"base_model.mixed_9_tower_1_mixed_conv_1_Conv2D.weight", "base_model.mixed_9_tower_1_mixed_conv_1_Conv2D.bias", "base_model.mixed_9_tower_1_mixed_conv_1_batchnorm.weight", "base_model.mixed_9_tower_1_mixed_conv_1_batchnorm.bias", "base_model.mixed_9_tower_1_mixed_conv_1_batchnorm.running_mean", "base_model.mixed_9_tower_1_mixed_conv_1_batchnorm.running_var", "base_model.mixed_9_tower_2_conv_Conv2D.weight", "base_model.mixed_9_tower_2_conv_Conv2D.bias", "base_model.mixed_9_tower_2_conv_batchnorm.weight", "base_model.mixed_9_tower_2_conv_batchnorm.bias", "base_model.mixed_9_tower_2_conv_batchnorm.running_mean", "base_model.mixed_9_tower_2_conv_batchnorm.running_var", "base_model.mixed_10_conv_Conv2D.weight", "base_model.mixed_10_conv_Conv2D.bias", "base_model.mixed_10_conv_batchnorm.weight", "base_model.mixed_10_conv_batchnorm.bias", "base_model.mixed_10_conv_batchnorm.running_mean", "base_model.mixed_10_conv_batchnorm.running_var", "base_model.mixed_10_tower_conv_Conv2D.weight", "base_model.mixed_10_tower_conv_Conv2D.bias", "base_model.mixed_10_tower_conv_batchnorm.weight", "base_model.mixed_10_tower_conv_batchnorm.bias", "base_model.mixed_10_tower_conv_batchnorm.running_mean", "base_model.mixed_10_tower_conv_batchnorm.running_var", "base_model.mixed_10_tower_mixed_conv_Conv2D.weight", "base_model.mixed_10_tower_mixed_conv_Conv2D.bias", "base_model.mixed_10_tower_mixed_conv_batchnorm.weight", "base_model.mixed_10_tower_mixed_conv_batchnorm.bias", "base_model.mixed_10_tower_mixed_conv_batchnorm.running_mean", "base_model.mixed_10_tower_mixed_conv_batchnorm.running_var", "base_model.mixed_10_tower_mixed_conv_1_Conv2D.weight", "base_model.mixed_10_tower_mixed_conv_1_Conv2D.bias", "base_model.mixed_10_tower_mixed_conv_1_batchnorm.weight", "base_model.mixed_10_tower_mixed_conv_1_batchnorm.bias", "base_model.mixed_10_tower_mixed_conv_1_batchnorm.running_mean", "base_model.mixed_10_tower_mixed_conv_1_batchnorm.running_var", "base_model.mixed_10_tower_1_conv_Conv2D.weight", 
"base_model.mixed_10_tower_1_conv_Conv2D.bias", "base_model.mixed_10_tower_1_conv_batchnorm.weight", "base_model.mixed_10_tower_1_conv_batchnorm.bias", "base_model.mixed_10_tower_1_conv_batchnorm.running_mean", "base_model.mixed_10_tower_1_conv_batchnorm.running_var", "base_model.mixed_10_tower_1_conv_1_Conv2D.weight", "base_model.mixed_10_tower_1_conv_1_Conv2D.bias", "base_model.mixed_10_tower_1_conv_1_batchnorm.weight", "base_model.mixed_10_tower_1_conv_1_batchnorm.bias", "base_model.mixed_10_tower_1_conv_1_batchnorm.running_mean", "base_model.mixed_10_tower_1_conv_1_batchnorm.running_var", "base_model.mixed_10_tower_1_mixed_conv_Conv2D.weight", "base_model.mixed_10_tower_1_mixed_conv_Conv2D.bias", "base_model.mixed_10_tower_1_mixed_conv_batchnorm.weight", "base_model.mixed_10_tower_1_mixed_conv_batchnorm.bias", "base_model.mixed_10_tower_1_mixed_conv_batchnorm.running_mean", "base_model.mixed_10_tower_1_mixed_conv_batchnorm.running_var", "base_model.mixed_10_tower_1_mixed_conv_1_Conv2D.weight", "base_model.mixed_10_tower_1_mixed_conv_1_Conv2D.bias", "base_model.mixed_10_tower_1_mixed_conv_1_batchnorm.weight", "base_model.mixed_10_tower_1_mixed_conv_1_batchnorm.bias", "base_model.mixed_10_tower_1_mixed_conv_1_batchnorm.running_mean", "base_model.mixed_10_tower_1_mixed_conv_1_batchnorm.running_var", "base_model.mixed_10_tower_2_conv_Conv2D.weight", "base_model.mixed_10_tower_2_conv_Conv2D.bias", "base_model.mixed_10_tower_2_conv_batchnorm.weight", "base_model.mixed_10_tower_2_conv_batchnorm.bias", "base_model.mixed_10_tower_2_conv_batchnorm.running_mean", "base_model.mixed_10_tower_2_conv_batchnorm.running_var". 
Unexpected key(s) in state_dict: "base_model.conv1_7x7_s2.weight", "base_model.conv1_7x7_s2.bias", "base_model.conv1_7x7_s2_bn.weight", "base_model.conv1_7x7_s2_bn.bias", "base_model.conv1_7x7_s2_bn.running_mean", "base_model.conv1_7x7_s2_bn.running_var", "base_model.conv2_3x3_reduce.weight", "base_model.conv2_3x3_reduce.bias", "base_model.conv2_3x3_reduce_bn.weight", "base_model.conv2_3x3_reduce_bn.bias", "base_model.conv2_3x3_reduce_bn.running_mean", "base_model.conv2_3x3_reduce_bn.running_var", "base_model.conv2_3x3.weight", "base_model.conv2_3x3.bias", "base_model.conv2_3x3_bn.weight", "base_model.conv2_3x3_bn.bias", "base_model.conv2_3x3_bn.running_mean", "base_model.conv2_3x3_bn.running_var", "base_model.inception_3a_1x1.weight", "base_model.inception_3a_1x1.bias", "base_model.inception_3a_1x1_bn.weight", "base_model.inception_3a_1x1_bn.bias", "base_model.inception_3a_1x1_bn.running_mean", "base_model.inception_3a_1x1_bn.running_var", "base_model.inception_3a_3x3_reduce.weight", "base_model.inception_3a_3x3_reduce.bias", "base_model.inception_3a_3x3_reduce_bn.weight", "base_model.inception_3a_3x3_reduce_bn.bias", "base_model.inception_3a_3x3_reduce_bn.running_mean", "base_model.inception_3a_3x3_reduce_bn.running_var", "base_model.inception_3a_3x3.weight", "base_model.inception_3a_3x3.bias", "base_model.inception_3a_3x3_bn.weight", "base_model.inception_3a_3x3_bn.bias", "base_model.inception_3a_3x3_bn.running_mean", "base_model.inception_3a_3x3_bn.running_var", "base_model.inception_3a_double_3x3_reduce.weight", "base_model.inception_3a_double_3x3_reduce.bias", "base_model.inception_3a_double_3x3_reduce_bn.weight", "base_model.inception_3a_double_3x3_reduce_bn.bias", "base_model.inception_3a_double_3x3_reduce_bn.running_mean", "base_model.inception_3a_double_3x3_reduce_bn.running_var", "base_model.inception_3a_double_3x3_1.weight", "base_model.inception_3a_double_3x3_1.bias", "base_model.inception_3a_double_3x3_1_bn.weight", 
"base_model.inception_3a_double_3x3_1_bn.bias", "base_model.inception_3a_double_3x3_1_bn.running_mean", "base_model.inception_3a_double_3x3_1_bn.running_var", "base_model.inception_3a_double_3x3_2.weight", "base_model.inception_3a_double_3x3_2.bias", "base_model.inception_3a_double_3x3_2_bn.weight", "base_model.inception_3a_double_3x3_2_bn.bias", "base_model.inception_3a_double_3x3_2_bn.running_mean", "base_model.inception_3a_double_3x3_2_bn.running_var", "base_model.inception_3a_pool_proj.weight", "base_model.inception_3a_pool_proj.bias", "base_model.inception_3a_pool_proj_bn.weight", "base_model.inception_3a_pool_proj_bn.bias", "base_model.inception_3a_pool_proj_bn.running_mean", "base_model.inception_3a_pool_proj_bn.running_var", "base_model.inception_3b_1x1.weight", "base_model.inception_3b_1x1.bias", "base_model.inception_3b_1x1_bn.weight", "base_model.inception_3b_1x1_bn.bias", "base_model.inception_3b_1x1_bn.running_mean", "base_model.inception_3b_1x1_bn.running_var", "base_model.inception_3b_3x3_reduce.weight", "base_model.inception_3b_3x3_reduce.bias", "base_model.inception_3b_3x3_reduce_bn.weight", "base_model.inception_3b_3x3_reduce_bn.bias", "base_model.inception_3b_3x3_reduce_bn.running_mean", "base_model.inception_3b_3x3_reduce_bn.running_var", "base_model.inception_3b_3x3.weight", "base_model.inception_3b_3x3.bias", "base_model.inception_3b_3x3_bn.weight", "base_model.inception_3b_3x3_bn.bias", "base_model.inception_3b_3x3_bn.running_mean", "base_model.inception_3b_3x3_bn.running_var", "base_model.inception_3b_double_3x3_reduce.weight", "base_model.inception_3b_double_3x3_reduce.bias", "base_model.inception_3b_double_3x3_reduce_bn.weight", "base_model.inception_3b_double_3x3_reduce_bn.bias", "base_model.inception_3b_double_3x3_reduce_bn.running_mean", "base_model.inception_3b_double_3x3_reduce_bn.running_var", "base_model.inception_3b_double_3x3_1.weight", "base_model.inception_3b_double_3x3_1.bias", "base_model.inception_3b_double_3x3_1_bn.weight", 
"base_model.inception_3b_double_3x3_1_bn.bias", "base_model.inception_3b_double_3x3_1_bn.running_mean", "base_model.inception_3b_double_3x3_1_bn.running_var", "base_model.inception_3b_double_3x3_2.weight", "base_model.inception_3b_double_3x3_2.bias", "base_model.inception_3b_double_3x3_2_bn.weight", "base_model.inception_3b_double_3x3_2_bn.bias", "base_model.inception_3b_double_3x3_2_bn.running_mean", "base_model.inception_3b_double_3x3_2_bn.running_var", "base_model.inception_3b_pool_proj.weight", "base_model.inception_3b_pool_proj.bias", "base_model.inception_3b_pool_proj_bn.weight", "base_model.inception_3b_pool_proj_bn.bias", "base_model.inception_3b_pool_proj_bn.running_mean", "base_model.inception_3b_pool_proj_bn.running_var", "base_model.inception_3c_3x3_reduce.weight", "base_model.inception_3c_3x3_reduce.bias", "base_model.inception_3c_3x3_reduce_bn.weight", "base_model.inception_3c_3x3_reduce_bn.bias", "base_model.inception_3c_3x3_reduce_bn.running_mean", "base_model.inception_3c_3x3_reduce_bn.running_var", "base_model.inception_3c_3x3.weight", "base_model.inception_3c_3x3.bias", "base_model.inception_3c_3x3_bn.weight", "base_model.inception_3c_3x3_bn.bias", "base_model.inception_3c_3x3_bn.running_mean", "base_model.inception_3c_3x3_bn.running_var", "base_model.inception_3c_double_3x3_reduce.weight", "base_model.inception_3c_double_3x3_reduce.bias", "base_model.inception_3c_double_3x3_reduce_bn.weight", "base_model.inception_3c_double_3x3_reduce_bn.bias", "base_model.inception_3c_double_3x3_reduce_bn.running_mean", "base_model.inception_3c_double_3x3_reduce_bn.running_var", "base_model.inception_3c_double_3x3_1.weight", "base_model.inception_3c_double_3x3_1.bias", "base_model.inception_3c_double_3x3_1_bn.weight", "base_model.inception_3c_double_3x3_1_bn.bias", "base_model.inception_3c_double_3x3_1_bn.running_mean", "base_model.inception_3c_double_3x3_1_bn.running_var", "base_model.inception_3c_double_3x3_2.weight", 
"base_model.inception_3c_double_3x3_2.bias", "base_model.inception_3c_double_3x3_2_bn.weight", "base_model.inception_3c_double_3x3_2_bn.bias", "base_model.inception_3c_double_3x3_2_bn.running_mean", "base_model.inception_3c_double_3x3_2_bn.running_var", "base_model.inception_4a_1x1.weight", "base_model.inception_4a_1x1.bias", "base_model.inception_4a_1x1_bn.weight", "base_model.inception_4a_1x1_bn.bias", "base_model.inception_4a_1x1_bn.running_mean", "base_model.inception_4a_1x1_bn.running_var", "base_model.inception_4a_3x3_reduce.weight", "base_model.inception_4a_3x3_reduce.bias", "base_model.inception_4a_3x3_reduce_bn.weight", "base_model.inception_4a_3x3_reduce_bn.bias", "base_model.inception_4a_3x3_reduce_bn.running_mean", "base_model.inception_4a_3x3_reduce_bn.running_var", "base_model.inception_4a_3x3.weight", "base_model.inception_4a_3x3.bias", "base_model.inception_4a_3x3_bn.weight", "base_model.inception_4a_3x3_bn.bias", "base_model.inception_4a_3x3_bn.running_mean", "base_model.inception_4a_3x3_bn.running_var", "base_model.inception_4a_double_3x3_reduce.weight", "base_model.inception_4a_double_3x3_reduce.bias", "base_model.inception_4a_double_3x3_reduce_bn.weight", "base_model.inception_4a_double_3x3_reduce_bn.bias", "base_model.inception_4a_double_3x3_reduce_bn.running_mean", "base_model.inception_4a_double_3x3_reduce_bn.running_var", "base_model.inception_4a_double_3x3_1.weight", "base_model.inception_4a_double_3x3_1.bias", "base_model.inception_4a_double_3x3_1_bn.weight", "base_model.inception_4a_double_3x3_1_bn.bias", "base_model.inception_4a_double_3x3_1_bn.running_mean", "base_model.inception_4a_double_3x3_1_bn.running_var", "base_model.inception_4a_double_3x3_2.weight", "base_model.inception_4a_double_3x3_2.bias", "base_model.inception_4a_double_3x3_2_bn.weight", "base_model.inception_4a_double_3x3_2_bn.bias", "base_model.inception_4a_double_3x3_2_bn.running_mean", "base_model.inception_4a_double_3x3_2_bn.running_var", 
"base_model.inception_4a_pool_proj.weight", "base_model.inception_4a_pool_proj.bias", "base_model.inception_4a_pool_proj_bn.weight", "base_model.inception_4a_pool_proj_bn.bias", "base_model.inception_4a_pool_proj_bn.running_mean", "base_model.inception_4a_pool_proj_bn.running_var", "base_model.inception_4b_1x1.weight", "base_model.inception_4b_1x1.bias", "base_model.inception_4b_1x1_bn.weight", "base_model.inception_4b_1x1_bn.bias", "base_model.inception_4b_1x1_bn.running_mean", "base_model.inception_4b_1x1_bn.running_var", "base_model.inception_4b_3x3_reduce.weight", "base_model.inception_4b_3x3_reduce.bias", "base_model.inception_4b_3x3_reduce_bn.weight", "base_model.inception_4b_3x3_reduce_bn.bias", "base_model.inception_4b_3x3_reduce_bn.running_mean", "base_model.inception_4b_3x3_reduce_bn.running_var", "base_model.inception_4b_3x3.weight", "base_model.inception_4b_3x3.bias", "base_model.inception_4b_3x3_bn.weight", "base_model.inception_4b_3x3_bn.bias", "base_model.inception_4b_3x3_bn.running_mean", "base_model.inception_4b_3x3_bn.running_var", "base_model.inception_4b_double_3x3_reduce.weight", "base_model.inception_4b_double_3x3_reduce.bias", "base_model.inception_4b_double_3x3_reduce_bn.weight", "base_model.inception_4b_double_3x3_reduce_bn.bias", "base_model.inception_4b_double_3x3_reduce_bn.running_mean", "base_model.inception_4b_double_3x3_reduce_bn.running_var", "base_model.inception_4b_double_3x3_1.weight", "base_model.inception_4b_double_3x3_1.bias", "base_model.inception_4b_double_3x3_1_bn.weight", "base_model.inception_4b_double_3x3_1_bn.bias", "base_model.inception_4b_double_3x3_1_bn.running_mean", "base_model.inception_4b_double_3x3_1_bn.running_var", "base_model.inception_4b_double_3x3_2.weight", "base_model.inception_4b_double_3x3_2.bias", "base_model.inception_4b_double_3x3_2_bn.weight", "base_model.inception_4b_double_3x3_2_bn.bias", "base_model.inception_4b_double_3x3_2_bn.running_mean", "base_model.inception_4b_double_3x3_2_bn.running_var", 
"base_model.inception_4b_pool_proj.weight", "base_model.inception_4b_pool_proj.bias", "base_model.inception_4b_pool_proj_bn.weight", "base_model.inception_4b_pool_proj_bn.bias", "base_model.inception_4b_pool_proj_bn.running_mean", "base_model.inception_4b_pool_proj_bn.running_var", "base_model.inception_4c_1x1.weight", "base_model.inception_4c_1x1.bias", "base_model.inception_4c_1x1_bn.weight", "base_model.inception_4c_1x1_bn.bias", "base_model.inception_4c_1x1_bn.running_mean", "base_model.inception_4c_1x1_bn.running_var", "base_model.inception_4c_3x3_reduce.weight", "base_model.inception_4c_3x3_reduce.bias", "base_model.inception_4c_3x3_reduce_bn.weight", "base_model.inception_4c_3x3_reduce_bn.bias", "base_model.inception_4c_3x3_reduce_bn.running_mean", "base_model.inception_4c_3x3_reduce_bn.running_var", "base_model.inception_4c_3x3.weight", "base_model.inception_4c_3x3.bias", "base_model.inception_4c_3x3_bn.weight", "base_model.inception_4c_3x3_bn.bias", "base_model.inception_4c_3x3_bn.running_mean", "base_model.inception_4c_3x3_bn.running_var", "base_model.inception_4c_double_3x3_reduce.weight", "base_model.inception_4c_double_3x3_reduce.bias", "base_model.inception_4c_double_3x3_reduce_bn.weight", "base_model.inception_4c_double_3x3_reduce_bn.bias", "base_model.inception_4c_double_3x3_reduce_bn.running_mean", "base_model.inception_4c_double_3x3_reduce_bn.running_var", "base_model.inception_4c_double_3x3_1.weight", "base_model.inception_4c_double_3x3_1.bias", "base_model.inception_4c_double_3x3_1_bn.weight", "base_model.inception_4c_double_3x3_1_bn.bias", "base_model.inception_4c_double_3x3_1_bn.running_mean", "base_model.inception_4c_double_3x3_1_bn.running_var", "base_model.inception_4c_double_3x3_2.weight", "base_model.inception_4c_double_3x3_2.bias", "base_model.inception_4c_double_3x3_2_bn.weight", "base_model.inception_4c_double_3x3_2_bn.bias", "base_model.inception_4c_double_3x3_2_bn.running_mean", "base_model.inception_4c_double_3x3_2_bn.running_var", 
"base_model.inception_4c_pool_proj.weight", "base_model.inception_4c_pool_proj.bias", "base_model.inception_4c_pool_proj_bn.weight", "base_model.inception_4c_pool_proj_bn.bias", "base_model.inception_4c_pool_proj_bn.running_mean", "base_model.inception_4c_pool_proj_bn.running_var", "base_model.inception_4d_1x1.weight", "base_model.inception_4d_1x1.bias", "base_model.inception_4d_1x1_bn.weight", "base_model.inception_4d_1x1_bn.bias", "base_model.inception_4d_1x1_bn.running_mean", "base_model.inception_4d_1x1_bn.running_var", "base_model.inception_4d_3x3_reduce.weight", "base_model.inception_4d_3x3_reduce.bias", "base_model.inception_4d_3x3_reduce_bn.weight", "base_model.inception_4d_3x3_reduce_bn.bias", "base_model.inception_4d_3x3_reduce_bn.running_mean", "base_model.inception_4d_3x3_reduce_bn.running_var", "base_model.inception_4d_3x3.weight", "base_model.inception_4d_3x3.bias", "base_model.inception_4d_3x3_bn.weight", "base_model.inception_4d_3x3_bn.bias", "base_model.inception_4d_3x3_bn.running_mean", "base_model.inception_4d_3x3_bn.running_var", "base_model.inception_4d_double_3x3_reduce.weight", "base_model.inception_4d_double_3x3_reduce.bias", "base_model.inception_4d_double_3x3_reduce_bn.weight", "base_model.inception_4d_double_3x3_reduce_bn.bias", "base_model.inception_4d_double_3x3_reduce_bn.running_mean", "base_model.inception_4d_double_3x3_reduce_bn.running_var", "base_model.inception_4d_double_3x3_1.weight", "base_model.inception_4d_double_3x3_1.bias", "base_model.inception_4d_double_3x3_1_bn.weight", "base_model.inception_4d_double_3x3_1_bn.bias", "base_model.inception_4d_double_3x3_1_bn.running_mean", "base_model.inception_4d_double_3x3_1_bn.running_var", "base_model.inception_4d_double_3x3_2.weight", "base_model.inception_4d_double_3x3_2.bias", "base_model.inception_4d_double_3x3_2_bn.weight", "base_model.inception_4d_double_3x3_2_bn.bias", "base_model.inception_4d_double_3x3_2_bn.running_mean", "base_model.inception_4d_double_3x3_2_bn.running_var", 
"base_model.inception_4d_pool_proj.weight", "base_model.inception_4d_pool_proj.bias", "base_model.inception_4d_pool_proj_bn.weight", "base_model.inception_4d_pool_proj_bn.bias", "base_model.inception_4d_pool_proj_bn.running_mean", "base_model.inception_4d_pool_proj_bn.running_var", "base_model.inception_4e_3x3_reduce.weight", "base_model.inception_4e_3x3_reduce.bias", "base_model.inception_4e_3x3_reduce_bn.weight", "base_model.inception_4e_3x3_reduce_bn.bias", "base_model.inception_4e_3x3_reduce_bn.running_mean", "base_model.inception_4e_3x3_reduce_bn.running_var", "base_model.inception_4e_3x3.weight", "base_model.inception_4e_3x3.bias", "base_model.inception_4e_3x3_bn.weight", "base_model.inception_4e_3x3_bn.bias", "base_model.inception_4e_3x3_bn.running_mean", "base_model.inception_4e_3x3_bn.running_var", "base_model.inception_4e_double_3x3_reduce.weight", "base_model.inception_4e_double_3x3_reduce.bias", "base_model.inception_4e_double_3x3_reduce_bn.weight", "base_model.inception_4e_double_3x3_reduce_bn.bias", "base_model.inception_4e_double_3x3_reduce_bn.running_mean", "base_model.inception_4e_double_3x3_reduce_bn.running_var", "base_model.inception_4e_double_3x3_1.weight", "base_model.inception_4e_double_3x3_1.bias", "base_model.inception_4e_double_3x3_1_bn.weight", "base_model.inception_4e_double_3x3_1_bn.bias", "base_model.inception_4e_double_3x3_1_bn.running_mean", "base_model.inception_4e_double_3x3_1_bn.running_var", "base_model.inception_4e_double_3x3_2.weight", "base_model.inception_4e_double_3x3_2.bias", "base_model.inception_4e_double_3x3_2_bn.weight", "base_model.inception_4e_double_3x3_2_bn.bias", "base_model.inception_4e_double_3x3_2_bn.running_mean", "base_model.inception_4e_double_3x3_2_bn.running_var", "base_model.inception_5a_1x1.weight", "base_model.inception_5a_1x1.bias", "base_model.inception_5a_1x1_bn.weight", "base_model.inception_5a_1x1_bn.bias", "base_model.inception_5a_1x1_bn.running_mean", "base_model.inception_5a_1x1_bn.running_var", 
"base_model.inception_5a_3x3_reduce.weight", "base_model.inception_5a_3x3_reduce.bias", "base_model.inception_5a_3x3_reduce_bn.weight", "base_model.inception_5a_3x3_reduce_bn.bias", "base_model.inception_5a_3x3_reduce_bn.running_mean", "base_model.inception_5a_3x3_reduce_bn.running_var", "base_model.inception_5a_3x3.weight", "base_model.inception_5a_3x3.bias", "base_model.inception_5a_3x3_bn.weight", "base_model.inception_5a_3x3_bn.bias", "base_model.inception_5a_3x3_bn.running_mean", "base_model.inception_5a_3x3_bn.running_var", "base_model.inception_5a_double_3x3_reduce.weight", "base_model.inception_5a_double_3x3_reduce.bias", "base_model.inception_5a_double_3x3_reduce_bn.weight", "base_model.inception_5a_double_3x3_reduce_bn.bias", "base_model.inception_5a_double_3x3_reduce_bn.running_mean", "base_model.inception_5a_double_3x3_reduce_bn.running_var", "base_model.inception_5a_double_3x3_1.weight", "base_model.inception_5a_double_3x3_1.bias", "base_model.inception_5a_double_3x3_1_bn.weight", "base_model.inception_5a_double_3x3_1_bn.bias", "base_model.inception_5a_double_3x3_1_bn.running_mean", "base_model.inception_5a_double_3x3_1_bn.running_var", "base_model.inception_5a_double_3x3_2.weight", "base_model.inception_5a_double_3x3_2.bias", "base_model.inception_5a_double_3x3_2_bn.weight", "base_model.inception_5a_double_3x3_2_bn.bias", "base_model.inception_5a_double_3x3_2_bn.running_mean", "base_model.inception_5a_double_3x3_2_bn.running_var", "base_model.inception_5a_pool_proj.weight", "base_model.inception_5a_pool_proj.bias", "base_model.inception_5a_pool_proj_bn.weight", "base_model.inception_5a_pool_proj_bn.bias", "base_model.inception_5a_pool_proj_bn.running_mean", "base_model.inception_5a_pool_proj_bn.running_var", "base_model.inception_5b_1x1.weight", "base_model.inception_5b_1x1.bias", "base_model.inception_5b_1x1_bn.weight", "base_model.inception_5b_1x1_bn.bias", "base_model.inception_5b_1x1_bn.running_mean", "base_model.inception_5b_1x1_bn.running_var", 
"base_model.inception_5b_3x3_reduce.weight", "base_model.inception_5b_3x3_reduce.bias", "base_model.inception_5b_3x3_reduce_bn.weight", "base_model.inception_5b_3x3_reduce_bn.bias", "base_model.inception_5b_3x3_reduce_bn.running_mean", "base_model.inception_5b_3x3_reduce_bn.running_var", "base_model.inception_5b_3x3.weight", "base_model.inception_5b_3x3.bias", "base_model.inception_5b_3x3_bn.weight", "base_model.inception_5b_3x3_bn.bias", "base_model.inception_5b_3x3_bn.running_mean", "base_model.inception_5b_3x3_bn.running_var", "base_model.inception_5b_double_3x3_reduce.weight", "base_model.inception_5b_double_3x3_reduce.bias", "base_model.inception_5b_double_3x3_reduce_bn.weight", "base_model.inception_5b_double_3x3_reduce_bn.bias", "base_model.inception_5b_double_3x3_reduce_bn.running_mean", "base_model.inception_5b_double_3x3_reduce_bn.running_var", "base_model.inception_5b_double_3x3_1.weight", "base_model.inception_5b_double_3x3_1.bias", "base_model.inception_5b_double_3x3_1_bn.weight", "base_model.inception_5b_double_3x3_1_bn.bias", "base_model.inception_5b_double_3x3_1_bn.running_mean", "base_model.inception_5b_double_3x3_1_bn.running_var", "base_model.inception_5b_double_3x3_2.weight", "base_model.inception_5b_double_3x3_2.bias", "base_model.inception_5b_double_3x3_2_bn.weight", "base_model.inception_5b_double_3x3_2_bn.bias", "base_model.inception_5b_double_3x3_2_bn.running_mean", "base_model.inception_5b_double_3x3_2_bn.running_var", "base_model.inception_5b_pool_proj.weight", "base_model.inception_5b_pool_proj.bias", "base_model.inception_5b_pool_proj_bn.weight", "base_model.inception_5b_pool_proj_bn.bias", "base_model.inception_5b_pool_proj_bn.running_mean", "base_model.inception_5b_pool_proj_bn.running_var". While copying the parameter named "new_fc.weight", whose dimensions in the model are torch.Size([256, 2048]) and whose dimensions in the checkpoint are torch.Size([256, 1024]).

michaelzhang917 avatar May 28 '18 18:05 michaelzhang917

It works with the BNInception architecture but not with InceptionV3.
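The log above lists `mixed_*` keys as missing and `inception_*` keys as unexpected, and `new_fc.weight` is 2048 wide in the model but 1024 wide in the checkpoint. That pattern is what you would expect if the checkpoint was saved from one backbone and the model was built with another. A minimal sketch for checking this without triggering the exception is below. `diff_state_dicts` is a hypothetical helper, not part of TRN-pytorch, and it only assumes that every value exposes a `.shape`, as PyTorch tensors do.

```python
def diff_state_dicts(model_sd, ckpt_sd):
    """Compare two state dicts by key and by shape.

    Hypothetical helper for illustration. Both arguments are mappings from
    parameter name to an object with a ``.shape`` attribute, for example
    ``model.state_dict()`` and a loaded checkpoint's state dict.
    """
    model_keys = set(model_sd)
    ckpt_keys = set(ckpt_sd)

    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_keys - ckpt_keys)
    # Keys the checkpoint provides but the model does not have.
    unexpected = sorted(ckpt_keys - model_keys)
    # Keys present on both sides whose shapes disagree.
    mismatched = {
        name: (tuple(model_sd[name].shape), tuple(ckpt_sd[name].shape))
        for name in sorted(model_keys & ckpt_keys)
        if tuple(model_sd[name].shape) != tuple(ckpt_sd[name].shape)
    }
    return missing, unexpected, mismatched
```

If nearly every backbone key shows up as missing or unexpected, the architecture passed to the test script probably does not match the one the checkpoint was trained with.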

michaelzhang917 avatar May 28 '18 21:05 michaelzhang917

@alexandonian @michaelzhang917 Hi, I am having some problems retraining TRN with PyTorch v0.4.0. Would your answers above also solve the problem of training a new model with PyTorch v0.4.0? Looking forward to any reply.

Ai-is-light avatar Aug 22 '18 03:08 Ai-is-light

I encountered the same problem across versions for my own model with batch_norm.
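For the `torch.Size([32])` versus `torch.Size([1, 32])` errors at the top of this issue, one possible workaround is to drop the extra leading dimension from the checkpoint tensors before loading. This is only a sketch, under the assumption that the checkpoint stores the same values with one extra singleton dimension. `squeeze_leading_singleton` is a hypothetical helper, and it relies only on `.shape` and `.squeeze(0)`, both of which PyTorch tensors provide.

```python
def squeeze_leading_singleton(ckpt_sd, model_sd):
    """Return a copy of ckpt_sd with extra leading size-1 dims removed.

    Hypothetical helper for illustration. A tensor is squeezed only when its
    shape equals (1,) + the model's shape for the same key, so entries whose
    shapes already match, or differ in any other way, are left untouched.
    """
    fixed = {}
    for name, value in ckpt_sd.items():
        target = model_sd.get(name)
        if target is not None and tuple(value.shape) == (1,) + tuple(target.shape):
            # e.g. checkpoint [1, 32] -> model [32]
            value = value.squeeze(0)
        fixed[name] = value
    return fixed
```

The result could then be passed to `load_state_dict` in place of the raw checkpoint. Any key that still differs in shape will raise the same error as before, which is intentional.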

zhanwenchen avatar Feb 08 '19 15:02 zhanwenchen

I downloaded this 'http://data.lip6.fr/cadene/pretrainedmodels/bn_inception-52deb4733.pth' then replaced https://github.com/yjxiong/tensorflow-model-zoo.torch/blob/e31e0b7aa451e2c12c0107e616953a03d8cd0d47/bninception/pytorch_load.py#L35 with:

    new_state_dict = {}
    temp2 = torch.load('model_zoo/bninception/bn_inception-52deb4733.pth')
    for k, v in temp2.items():
        # Rename the classifier keys from 'last_linear.*' to 'fc.*'.
        if k.split(".")[0] == 'last_linear':
            new_state_dict['fc.' + k.split(".")[1]] = v
        else:
            new_state_dict[k] = v
    self.load_state_dict(new_state_dict, strict=False)
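The key renaming in the workaround above can be pulled out into a small function so it can be checked on its own before loading anything. This is a sketch rather than the repository's code. `rename_prefix` and its parameter names are hypothetical, and the defaults simply mirror the `last_linear` to `fc` mapping used in the comment.

```python
def rename_prefix(state_dict, old="last_linear", new="fc"):
    """Return a copy of state_dict with the first key component renamed.

    Hypothetical helper for illustration. Keys whose first dot-separated
    component equals ``old`` get it replaced by ``new``; all other keys are
    copied unchanged. Unlike ``k.split(".")[1]``, ``partition`` keeps the
    whole remainder of the key, so nested names are not truncated.
    """
    renamed = {}
    for key, value in state_dict.items():
        head, sep, tail = key.partition(".")
        if head == old:
            key = new + sep + tail
        renamed[key] = value
    return renamed
```

Note that `strict=False` makes `load_state_dict` skip keys that are missing or unexpected, so it is worth inspecting which keys were actually loaded rather than assuming all of them were.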

kritiksoman avatar Jun 21 '19 06:06 kritiksoman

Could you please upload the data links again?

dralmadani avatar Jul 07 '21 04:07 dralmadani