pytorch-classification Error(s) in loading state

Error(s) in loading state_dict for DataParallel:

Open cosmolu opened this issue 5 years ago • 11 comments

When I download the pre-trained model and resume from it, there is an error at model.load_state_dict(checkpoint['state_dict']). It seems that the names do not match (e.g. "module.features.0.weight" vs. "features.module.0.weight"). How can I solve this if I wish to use the pre-trained model on CIFAR-10? Thank you!

        Traceback (most recent call last):
          File "test_0.py", line 130, in <module>
            model = load_model()
          File "test_0.py", line 104, in load_model
            model.load_state_dict(checkpoint['state_dict'])
          File "/home/cosmo/anaconda3/envs/tf8/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
            self.__class__.__name__, "\n\t".join(error_msgs)))
        RuntimeError: Error(s) in loading state_dict for DataParallel:
            Missing key(s) in state_dict: "module.features.0.weight", "module.features.0.bias", "module.features.3.weight", "module.features.3.bias", "module.features.6.weight", "module.features.6.bias", "module.features.8.weight", "module.features.8.bias", "module.features.10.weight", "module.features.10.bias", "module.classifier.weight", "module.classifier.bias".
            Unexpected key(s) in state_dict: "features.module.0.weight", "features.module.0.bias", "features.module.3.weight", "features.module.3.bias", "features.module.6.weight", "features.module.6.bias", "features.module.8.weight", "features.module.8.bias", "features.module.10.weight", "features.module.10.bias", "classifier.weight", "classifier.bias".

Oct 15 '18 07:10 cosmolu

Hi,

The problem is that the model was saved with DataParallel activated and you are trying to load it without DataParallel. That's why there's an extra "module" in each key!

Refer to this link for more information: https://discuss.pytorch.org/t/missing-keys-unexpected-keys-in-state-dict-when-loading-self-trained-model/22379
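
One way to make a checkpoint independent of where DataParallel was applied is to drop the "module" component from every key before loading it into an unwrapped model. The following is only a minimal sketch: the helper name `strip_data_parallel` is hypothetical, it works on any plain key-to-value mapping, and it assumes no real layer in the network is itself named "module".

```python
from collections import OrderedDict


def strip_data_parallel(state_dict):
    """Return a copy of state_dict with every 'module' path component removed.

    Assumption: 'module' only appears because of a DataParallel wrapper,
    never as the name of a real layer.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        parts = [p for p in key.split('.') if p != 'module']
        cleaned['.'.join(parts)] = value
    return cleaned


# Key names taken from the error message in this issue; values are placeholders.
checkpoint_keys = OrderedDict([
    ('features.module.0.weight', 'w0'),
    ('classifier.weight', 'wc'),
])
print(list(strip_data_parallel(checkpoint_keys)))
# ['features.0.weight', 'classifier.weight']
```

The cleaned dict would then be loaded into a model that has not been wrapped yet, and the DataParallel wrapper applied afterwards.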

Nov 27 '18 04:11 imirzadeh

You can also manually update the dict, like this:

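        from collections import OrderedDict

        state_dict = checkpoint['state_dict']
        new_state_dict = OrderedDict()

        for k, v in state_dict.items():
            if 'module' not in k:
                # No DataParallel prefix at all (e.g. classifier.weight): add one.
                k = 'module.' + k
            else:
                # Prefix sits on the sub-module: move it to the front of the key.
                k = k.replace('features.module.', 'module.features.')
            new_state_dict[k] = v

        # The keys now match a model that is wrapped as a whole in DataParallel.
        model.load_state_dict(new_state_dict)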
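
The snippet above hard-codes the "features.module." pattern. A slightly more general variant, sketched here under assumptions, renames each checkpoint key to whatever spelling the target model uses, wherever the "module" component sits. The helper names `canonical` and `remap_keys` are hypothetical, and the approach assumes no real layer is named "module".

```python
from collections import OrderedDict


def canonical(key):
    """Drop 'module' components so wrapped and unwrapped names compare equal."""
    return '.'.join(p for p in key.split('.') if p != 'module')


def remap_keys(state_dict, target_keys):
    """Rename checkpoint keys to the spelling used in target_keys.

    Keys with no counterpart in target_keys are kept unchanged, so a
    later load_state_dict call would still report them.
    """
    by_canonical = {canonical(t): t for t in target_keys}
    remapped = OrderedDict()
    for key, value in state_dict.items():
        remapped[by_canonical.get(canonical(key), key)] = value
    return remapped


# Key names from the traceback in this issue; values are placeholders.
checkpoint = OrderedDict([
    ('features.module.0.weight', 'w0'),
    ('classifier.weight', 'wc'),
])
target = ['module.features.0.weight', 'module.classifier.weight']
print(list(remap_keys(checkpoint, target)))
# ['module.features.0.weight', 'module.classifier.weight']
```

With a real model, `target_keys` would be `model.state_dict().keys()` and the result would be passed to `model.load_state_dict(...)`.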

Mar 20 '19 15:03 goncalomordido

Multi-GPU usage in PyTorch is a little difficult. In TF, you just set os.environ["CUDA_VISIBLE_DEVICES"] = '0,1,2,3'.

Apr 17 '19 02:04 xjock

I am getting an error similar to this one. This is what I am running: python cifar.py -a preresnet --depth 110 --epochs 3 --schedule 81 122 --gamma 0.1 --wd 1e-4 --checkpoint checkpoints/cifar10/preresnet-110 --resume 'checkpoint.pth.tar' ('checkpoint.pth.tar' is from the OneDrive folder)

RuntimeError: Error(s) in loading state_dict for DataParallel: Missing key(s) in state_dict: "module.bn.weight", "module.bn.bias", "module.bn.running_mean", "module.bn.running_var". Unexpected key(s) in state_dict: "module.bn1.weight", "module.bn1.bias", "module.bn1.running_mean", "module.bn1.running_var", "module.layer1.0.conv3.weight", "module.layer1.0.bn3.weight", "module.layer1.0.bn3.bias", "module.layer1.0.bn3.running_mean", "module.layer1.0.bn3.running_var", "module.layer1.0.downsample.0.weight", "module.layer1.0.downsample.1.weight", "module.layer1.0.downsample.1.bias", "module.layer1.0.downsample.1.running_mean", "module.layer1.0.downsample.1.running_var", "module.layer1.1.conv3.weight", "module.layer1.1.bn3.weight", "module.layer1.1.bn3.bias", "module.layer1.1.bn3.running_mean", "module.layer1.1.bn3.running_var", "module.layer1.2.conv3.weight", "module.layer1.2.bn3.weight", "module.layer1.2.bn3.bias", "module.layer1.2.bn3.running_mean", "module.layer1.2.bn3.running_var", "module.layer1.3.conv3.weight", "module.layer1.3.bn3.weight", "module.layer1.3.bn3.bias", "module.layer1.3.bn3.running_mean", "module.layer1.3.bn3.running_var", "module.layer1.4.conv3.weight", "module.layer1.4.bn3.weight", "module.layer1.4.bn3.bias", "module.layer1.4.bn3.running_mean", "module.layer1.4.bn3.running_var", "module.layer1.5.conv3.weight", "module.layer1.5.bn3.weight", "module.layer1.5.bn3.bias", "module.layer1.5.bn3.running_mean", "module.layer1.5.bn3.running_var", "module.layer1.6.conv3.weight", "module.layer1.6.bn3.weight", "module.layer1.6.bn3.bias", "module.layer1.6.bn3.running_mean", "module.layer1.6.bn3.running_var", "module.layer1.7.conv3.weight", "module.layer1.7.bn3.weight", "module.layer1.7.bn3.bias", "module.layer1.7.bn3.running_mean", "module.layer1.7.bn3.running_var", "module.layer1.8.conv3.weight", "module.layer1.8.bn3.weight", "module.layer1.8.bn3.bias", "module.layer1.8.bn3.running_mean", "module.layer1.8.bn3.running_var", "module.layer1.9.conv3.weight", 
"module.layer1.9.bn3.weight", "module.layer1.9.bn3.bias", "module.layer1.9.bn3.running_mean", "module.layer1.9.bn3.running_var", "module.layer1.10.conv3.weight", "module.layer1.10.bn3.weight", "module.layer1.10.bn3.bias", "module.layer1.10.bn3.running_mean", "module.layer1.10.bn3.running_var", "module.layer1.11.conv3.weight", "module.layer1.11.bn3.weight", "module.layer1.11.bn3.bias", "module.layer1.11.bn3.running_mean", "module.layer1.11.bn3.running_var", "module.layer1.12.conv3.weight", "module.layer1.12.bn3.weight", "module.layer1.12.bn3.bias", "module.layer1.12.bn3.running_mean", "module.layer1.12.bn3.running_var", "module.layer1.13.conv3.weight", "module.layer1.13.bn3.weight", "module.layer1.13.bn3.bias", "module.layer1.13.bn3.running_mean", "module.layer1.13.bn3.running_var", "module.layer1.14.conv3.weight", "module.layer1.14.bn3.weight", "module.layer1.14.bn3.bias", "module.layer1.14.bn3.running_mean", "module.layer1.14.bn3.running_var", "module.layer1.15.conv3.weight", "module.layer1.15.bn3.weight", "module.layer1.15.bn3.bias", "module.layer1.15.bn3.running_mean", "module.layer1.15.bn3.running_var", "module.layer1.16.conv3.weight", "module.layer1.16.bn3.weight", "module.layer1.16.bn3.bias", "module.layer1.16.bn3.running_mean", "module.layer1.16.bn3.running_var", "module.layer1.17.conv3.weight", "module.layer1.17.bn3.weight", "module.layer1.17.bn3.bias", "module.layer1.17.bn3.running_mean", "module.layer1.17.bn3.running_var", "module.layer2.0.conv3.weight", "module.layer2.0.bn3.weight", "module.layer2.0.bn3.bias", "module.layer2.0.bn3.running_mean", "module.layer2.0.bn3.running_var", "module.layer2.0.downsample.1.weight", "module.layer2.0.downsample.1.bias", "module.layer2.0.downsample.1.running_mean", "module.layer2.0.downsample.1.running_var", "module.layer2.1.conv3.weight", "module.layer2.1.bn3.weight", "module.layer2.1.bn3.bias", "module.layer2.1.bn3.running_mean", "module.layer2.1.bn3.running_var", "module.layer2.2.conv3.weight", 
"module.layer2.2.bn3.weight", "module.layer2.2.bn3.bias", "module.layer2.2.bn3.running_mean", "module.layer2.2.bn3.running_var", "module.layer2.3.conv3.weight", "module.layer2.3.bn3.weight", "module.layer2.3.bn3.bias", "module.layer2.3.bn3.running_mean", "module.layer2.3.bn3.running_var", "module.layer2.4.conv3.weight", "module.layer2.4.bn3.weight", "module.layer2.4.bn3.bias", "module.layer2.4.bn3.running_mean", "module.layer2.4.bn3.running_var", "module.layer2.5.conv3.weight", "module.layer2.5.bn3.weight", "module.layer2.5.bn3.bias", "module.layer2.5.bn3.running_mean", "module.layer2.5.bn3.running_var", "module.layer2.6.conv3.weight", "module.layer2.6.bn3.weight", "module.layer2.6.bn3.bias", "module.layer2.6.bn3.running_mean", "module.layer2.6.bn3.running_var", "module.layer2.7.conv3.weight", "module.layer2.7.bn3.weight", "module.layer2.7.bn3.bias", "module.layer2.7.bn3.running_mean", "module.layer2.7.bn3.running_var", "module.layer2.8.conv3.weight", "module.layer2.8.bn3.weight", "module.layer2.8.bn3.bias", "module.layer2.8.bn3.running_mean", "module.layer2.8.bn3.running_var", "module.layer2.9.conv3.weight", "module.layer2.9.bn3.weight", "module.layer2.9.bn3.bias", "module.layer2.9.bn3.running_mean", "module.layer2.9.bn3.running_var", "module.layer2.10.conv3.weight", "module.layer2.10.bn3.weight", "module.layer2.10.bn3.bias", "module.layer2.10.bn3.running_mean", "module.layer2.10.bn3.running_var", "module.layer2.11.conv3.weight", "module.layer2.11.bn3.weight", "module.layer2.11.bn3.bias", "module.layer2.11.bn3.running_mean", "module.layer2.11.bn3.running_var", "module.layer2.12.conv3.weight", "module.layer2.12.bn3.weight", "module.layer2.12.bn3.bias", "module.layer2.12.bn3.running_mean", "module.layer2.12.bn3.running_var", "module.layer2.13.conv3.weight", "module.layer2.13.bn3.weight", "module.layer2.13.bn3.bias", "module.layer2.13.bn3.running_mean", "module.layer2.13.bn3.running_var", "module.layer2.14.conv3.weight", "module.layer2.14.bn3.weight", 
"module.layer2.14.bn3.bias", "module.layer2.14.bn3.running_mean", "module.layer2.14.bn3.running_var", "module.layer2.15.conv3.weight", "module.layer2.15.bn3.weight", "module.layer2.15.bn3.bias", "module.layer2.15.bn3.running_mean", "module.layer2.15.bn3.running_var", "module.layer2.16.conv3.weight", "module.layer2.16.bn3.weight", "module.layer2.16.bn3.bias", "module.layer2.16.bn3.running_mean", "module.layer2.16.bn3.running_var", "module.layer2.17.conv3.weight", "module.layer2.17.bn3.weight", "module.layer2.17.bn3.bias", "module.layer2.17.bn3.running_mean", "module.layer2.17.bn3.running_var", "module.layer3.0.conv3.weight", "module.layer3.0.bn3.weight", "module.layer3.0.bn3.bias", "module.layer3.0.bn3.running_mean", "module.layer3.0.bn3.running_var", "module.layer3.0.downsample.1.weight", "module.layer3.0.downsample.1.bias", "module.layer3.0.downsample.1.running_mean", "module.layer3.0.downsample.1.running_var", "module.layer3.1.conv3.weight", "module.layer3.1.bn3.weight", "module.layer3.1.bn3.bias", "module.layer3.1.bn3.running_mean", "module.layer3.1.bn3.running_var", "module.layer3.2.conv3.weight", "module.layer3.2.bn3.weight", "module.layer3.2.bn3.bias", "module.layer3.2.bn3.running_mean", "module.layer3.2.bn3.running_var", "module.layer3.3.conv3.weight", "module.layer3.3.bn3.weight", "module.layer3.3.bn3.bias", "module.layer3.3.bn3.running_mean", "module.layer3.3.bn3.running_var", "module.layer3.4.conv3.weight", "module.layer3.4.bn3.weight", "module.layer3.4.bn3.bias", "module.layer3.4.bn3.running_mean", "module.layer3.4.bn3.running_var", "module.layer3.5.conv3.weight", "module.layer3.5.bn3.weight", "module.layer3.5.bn3.bias", "module.layer3.5.bn3.running_mean", "module.layer3.5.bn3.running_var", "module.layer3.6.conv3.weight", "module.layer3.6.bn3.weight", "module.layer3.6.bn3.bias", "module.layer3.6.bn3.running_mean", "module.layer3.6.bn3.running_var", "module.layer3.7.conv3.weight", "module.layer3.7.bn3.weight", "module.layer3.7.bn3.bias", 
"module.layer3.7.bn3.running_mean", "module.layer3.7.bn3.running_var", "module.layer3.8.conv3.weight", "module.layer3.8.bn3.weight", "module.layer3.8.bn3.bias", "module.layer3.8.bn3.running_mean", "module.layer3.8.bn3.running_var", "module.layer3.9.conv3.weight", "module.layer3.9.bn3.weight", "module.layer3.9.bn3.bias", "module.layer3.9.bn3.running_mean", "module.layer3.9.bn3.running_var", "module.layer3.10.conv3.weight", "module.layer3.10.bn3.weight", "module.layer3.10.bn3.bias", "module.layer3.10.bn3.running_mean", "module.layer3.10.bn3.running_var", "module.layer3.11.conv3.weight", "module.layer3.11.bn3.weight", "module.layer3.11.bn3.bias", "module.layer3.11.bn3.running_mean", "module.layer3.11.bn3.running_var", "module.layer3.12.conv3.weight", "module.layer3.12.bn3.weight", "module.layer3.12.bn3.bias", "module.layer3.12.bn3.running_mean", "module.layer3.12.bn3.running_var", "module.layer3.13.conv3.weight", "module.layer3.13.bn3.weight", "module.layer3.13.bn3.bias", "module.layer3.13.bn3.running_mean", "module.layer3.13.bn3.running_var", "module.layer3.14.conv3.weight", "module.layer3.14.bn3.weight", "module.layer3.14.bn3.bias", "module.layer3.14.bn3.running_mean", "module.layer3.14.bn3.running_var", "module.layer3.15.conv3.weight", "module.layer3.15.bn3.weight", "module.layer3.15.bn3.bias", "module.layer3.15.bn3.running_mean", "module.layer3.15.bn3.running_var", "module.layer3.16.conv3.weight", "module.layer3.16.bn3.weight", "module.layer3.16.bn3.bias", "module.layer3.16.bn3.running_mean", "module.layer3.16.bn3.running_var", "module.layer3.17.conv3.weight", "module.layer3.17.bn3.weight", "module.layer3.17.bn3.bias", "module.layer3.17.bn3.running_mean", "module.layer3.17.bn3.running_var". size mismatch for module.layer1.0.conv1.weight: copying a param with shape torch.Size([16, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). 
size mismatch for module.layer1.1.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.2.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.3.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.4.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.5.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.6.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.7.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.8.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.9.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.10.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.11.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). 
size mismatch for module.layer1.12.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.13.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.14.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.15.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.16.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.17.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer2.0.bn1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.bn1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.bn1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.bn1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.conv1.weight: copying a param with shape torch.Size([32, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]). 
size mismatch for module.layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]). size mismatch for module.layer2.1.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.2.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.3.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.4.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.5.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.6.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.7.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.8.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.9.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.10.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). 
size mismatch for module.layer2.11.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.12.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.13.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.14.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.15.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.16.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.17.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer3.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). size mismatch for module.layer3.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). size mismatch for module.layer3.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). size mismatch for module.layer3.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). 
size mismatch for module.layer3.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]). size mismatch for module.layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]). size mismatch for module.layer3.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.3.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.4.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.5.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.6.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.7.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.8.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.9.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). 
size mismatch for module.layer3.10.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.11.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.12.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.13.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.14.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.15.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.16.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.17.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.fc.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([10, 64]). size mismatch for module.fc.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([10]).
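
Note that this error is not the prefix problem discussed earlier in the thread: every key already starts with "module.". The checkpoint contains conv3/bn3 layers and a 100-way fc, while the model being built has 3x3 conv1 kernels and a 10-way fc. That suggests the checkpoint was trained with a different block type or depth and on a 100-class dataset, so renaming keys would not help; the model arguments would have to match the checkpoint. A minimal sketch for summarising such a mismatch follows. The helper name `diff_shapes` is hypothetical, and it compares plain {name: shape} mappings, which for real tensors could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def diff_shapes(checkpoint_shapes, model_shapes):
    """Compare two {name: shape} mappings the way load_state_dict reports them."""
    missing = [k for k in model_shapes if k not in checkpoint_shapes]
    unexpected = [k for k in checkpoint_shapes if k not in model_shapes]
    mismatched = {
        k: (checkpoint_shapes[k], model_shapes[k])
        for k in checkpoint_shapes
        if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]
    }
    return missing, unexpected, mismatched


# A few entries copied from the error message above.
# None marks shapes that the error message does not show.
checkpoint_shapes = {
    'module.layer1.0.conv3.weight': None,
    'module.layer1.0.conv1.weight': (16, 16, 1, 1),
    'module.fc.weight': (100, 256),
    'module.fc.bias': (100,),
}
model_shapes = {
    'module.bn.weight': None,
    'module.layer1.0.conv1.weight': (16, 16, 3, 3),
    'module.fc.weight': (10, 64),
    'module.fc.bias': (10,),
}

missing, unexpected, mismatched = diff_shapes(checkpoint_shapes, model_shapes)
print('missing:', missing)
print('unexpected:', unexpected)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(name, 'checkpoint', ckpt_shape, 'model', model_shape)
```

Here the fc shapes alone (100 outputs in the checkpoint, 10 in the model) already point to a difference in the number of classes.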

Jul 25 '19 07:07 raghav1810

I suppose this issue can be closed as the referenced post mentions the cause of the error and offers a solution

Mar 17 '20 11:03 jiteshm17

life saver!

Jan 28 '21 22:01 yustiks

size mismatch for module.layer1.12.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.13.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.14.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.15.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.16.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer1.17.conv1.weight: copying a param with shape torch.Size([16, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]). size mismatch for module.layer2.0.bn1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.bn1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.bn1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.bn1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]). size mismatch for module.layer2.0.conv1.weight: copying a param with shape torch.Size([32, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]). 
size mismatch for module.layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]). size mismatch for module.layer2.1.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.2.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.3.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.4.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.5.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.6.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.7.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.8.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.9.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.10.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). 
size mismatch for module.layer2.11.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.12.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.13.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.14.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.15.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.16.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer2.17.conv1.weight: copying a param with shape torch.Size([32, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]). size mismatch for module.layer3.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). size mismatch for module.layer3.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). size mismatch for module.layer3.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). size mismatch for module.layer3.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]). 
size mismatch for module.layer3.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 3, 3]). size mismatch for module.layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 32, 1, 1]). size mismatch for module.layer3.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.3.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.4.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.5.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.6.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.7.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.8.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.9.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). 
size mismatch for module.layer3.10.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.11.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.12.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.13.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.14.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.15.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.16.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.layer3.17.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for module.fc.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([10, 64]). size mismatch for module.fc.bias: copying a param with shape torch.Size([100]) from checkpoint, the shape in current model is torch.Size([10]).

Did you solve it, please?

Sep 27 '21 23:09 ersamo

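        from collections import OrderedDict

        # Rename checkpoint keys to the names a fully DataParallel-wrapped model expects.
        state_dict = checkpoint['state_dict']
        new_state_dict = OrderedDict()

        for k, v in state_dict.items():
            if 'module' not in k:
                # e.g. "classifier.weight" -> "module.classifier.weight"
                k = 'module.' + k
            else:
                # e.g. "features.module.0.weight" -> "module.features.0.weight"
                k = k.replace('features.module.', 'module.features.')
            new_state_dict[k] = v

        model.load_state_dict(new_state_dict)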
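
The renaming above is tied to the "features.module." layout from the original report. A more general version might look like the sketch below. It assumes the checkpoint and the model differ only in where (or whether) a "module" component appears in each key; normalize_key and remap_to_model_keys are hypothetical helper names, not part of PyTorch, and plain strings stand in for the tensors.

```python
from collections import OrderedDict


def normalize_key(key):
    """Drop every 'module' path component, e.g. 'features.module.0.weight' -> 'features.0.weight'."""
    return ".".join(part for part in key.split(".") if part != "module")


def remap_to_model_keys(checkpoint_state, model_keys):
    """Rename checkpoint entries to the key names the target model expects.

    Keys are matched after removing 'module' components on both sides.
    Raises KeyError if a checkpoint entry has no counterpart in the model.
    """
    wanted = {normalize_key(k): k for k in model_keys}
    remapped = OrderedDict()
    for key, value in checkpoint_state.items():
        remapped[wanted[normalize_key(key)]] = value
    return remapped


# Placeholder values stand in for tensors; the key names come from the traceback.
checkpoint_state = OrderedDict([
    ("features.module.0.weight", "w0"),
    ("features.module.0.bias", "b0"),
    ("classifier.weight", "w1"),
])
model_keys = [
    "module.features.0.weight",
    "module.features.0.bias",
    "module.classifier.weight",
]

new_state_dict = remap_to_model_keys(checkpoint_state, model_keys)
print(list(new_state_dict))
```

With a real model this would presumably be called as remap_to_model_keys(checkpoint['state_dict'], model.state_dict().keys()) before model.load_state_dict(...). Note that it would also collapse a submodule that is legitimately named "module", so it is only safe when that name comes from DataParallel.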

This is the solution!!!! Thanks!!!!!

Feb 11 '22 02:02 Kin-Zhang

change: model.load_state_dict(torch.load(path + '/pytorch_model.pt')) to model.load_state_dict(torch.load(path + '/pytorch_model.pt'), strict=False)

Apr 09 '22 04:04 bilalghanem

change: model.load_state_dict(torch.load(path + '/pytorch_model.pt')) to model.load_state_dict(torch.load(path + '/pytorch_model.pt'), strict=False)

Although this can make the RuntimeError go away, don't do it unless you know what you are doing. Any parameters that are missing from the checkpoint keep their initial (typically random) values, and unexpected checkpoint keys are silently ignored. That's not what you want if the issue is caused by a mix-up of parameter names, as was the case for the issue reporter.
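
Before reaching for strict=False, it can help to list exactly which names disagree. The sketch below is one way to do that on plain key collections; diff_keys is a hypothetical helper, and the two key lists are shortened examples taken from the original traceback. In practice they would come from model.state_dict().keys() and checkpoint['state_dict'].keys().

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) in the sense of load_state_dict's error message.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return sorted(model_keys - checkpoint_keys), sorted(checkpoint_keys - model_keys)


model_keys = ["module.features.0.weight", "module.classifier.weight"]
checkpoint_keys = ["features.module.0.weight", "classifier.weight"]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If every missing key has an obviously renamed twin among the unexpected keys, as here, the problem is a naming mix-up and renaming the keys is the appropriate fix rather than strict=False.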

Nov 17 '22 08:11 mgrachten

When saving a checkpoint in DataParallel (DP) mode, use model.module.state_dict() instead of model.state_dict(), so that the saved keys carry no "module." prefix.
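
A small helper can make this work whether or not the model is wrapped. This is only a sketch: unwrapped_state_dict is a hypothetical name, and it relies on nn.DataParallel exposing the wrapped network as its .module attribute.

```python
def unwrapped_state_dict(model):
    """Return the state dict of the underlying model, without the 'module.' prefix.

    nn.DataParallel keeps the wrapped network in its `.module` attribute;
    a plain model has no such attribute and is used directly.
    """
    return getattr(model, "module", model).state_dict()
```

It could then be used when saving, for example torch.save({'state_dict': unwrapped_state_dict(model)}, path), so the checkpoint loads into an unwrapped model without any key renaming.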

Dec 27 '23 09:12 Chenny0808
