RuntimeError: Error(s) in loading state_dict for SSD

Open SalahAdDin opened this issue 5 years ago • 3 comments

Hello guys

I'm trying out this SSD implementation to evaluate the algorithm. I already trained the model on the COCO dataset using this fork; after one week I got the trained model, but now, when I try to test or evaluate it, I get this error:

python eval.py --trained_model=checkpoints/ssd300_COCO_395000.pth
/home/joselito92/Projects/thesis/ssd/amdegroot/ssd.pytorch/ssd.py:34: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.priors = Variable(self.priorbox.forward(), volatile=True)
Traceback (most recent call last):
  File "eval.py", line 416, in <module>
    net.load_state_dict(torch.load(args.trained_model))
  File "/home/joselito92/Projects/thesis/ssd/amdegroot/lib/python3.7/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SSD:
        size mismatch for conf.0.weight: copying a param with shape torch.Size([804, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([84, 512, 3, 3]).
        size mismatch for conf.0.bias: copying a param with shape torch.Size([804]) from checkpoint, the shape in current model is torch.Size([84]).
        size mismatch for conf.1.weight: copying a param with shape torch.Size([1206, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 1024, 3, 3]).
        size mismatch for conf.1.bias: copying a param with shape torch.Size([1206]) from checkpoint, the shape in current model is torch.Size([126]).
        size mismatch for conf.2.weight: copying a param with shape torch.Size([1206, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 512, 3, 3]).
        size mismatch for conf.2.bias: copying a param with shape torch.Size([1206]) from checkpoint, the shape in current model is torch.Size([126]).
        size mismatch for conf.3.weight: copying a param with shape torch.Size([1206, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([126, 256, 3, 3]).
        size mismatch for conf.3.bias: copying a param with shape torch.Size([1206]) from checkpoint, the shape in current model is torch.Size([126]).
        size mismatch for conf.4.weight: copying a param with shape torch.Size([804, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([84, 256, 3, 3]).
        size mismatch for conf.4.bias: copying a param with shape torch.Size([804]) from checkpoint, the shape in current model is torch.Size([84]).
        size mismatch for conf.5.weight: copying a param with shape torch.Size([804, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([84, 256, 3, 3]).
        size mismatch for conf.5.bias: copying a param with shape torch.Size([804]) from checkpoint, the shape in current model is torch.Size([84]).

What's the problem here?

SalahAdDin, Jul 22 '19 11:07

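@SalahAdDin You have a shape mismatch problem: the conf layers stored in the checkpoint are sized for a different number of classes than the model that eval.py builds. Did you modify the config file in the data folder before training?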
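To see where the mismatch comes from, the channel counts in the traceback can be divided by the number of default boxes each conf layer predicts. The sketch below assumes the usual SSD300 layout of 4, 6, 6, 6, 4, 4 boxes per location; `infer_num_classes` is a hypothetical helper, not something from the repo.

```python
# Default boxes per feature-map location for the six conf layers.
# Assumption: the standard SSD300 layout (4, 6, 6, 6, 4, 4).
BOXES_PER_LOCATION = [4, 6, 6, 6, 4, 4]


def infer_num_classes(conf_out_channels, boxes_per_location=BOXES_PER_LOCATION):
    """Return the class count implied by the conf layers' output channels.

    Each conf layer outputs boxes * num_classes channels, so dividing the
    channel count by the box count should give the same number everywhere.
    """
    if len(conf_out_channels) != len(boxes_per_location):
        raise ValueError("expected one channel count per conf layer")
    counts = set()
    for channels, boxes in zip(conf_out_channels, boxes_per_location):
        if channels % boxes != 0:
            raise ValueError(f"{channels} is not a multiple of {boxes}")
        counts.add(channels // boxes)
    if len(counts) != 1:
        raise ValueError(f"layers disagree on the class count: {sorted(counts)}")
    return counts.pop()


# Output channels of conf.0 .. conf.5, copied from the traceback.
checkpoint_channels = [804, 1206, 1206, 1206, 804, 804]
model_channels = [84, 126, 126, 126, 84, 84]

print(infer_num_classes(checkpoint_channels))  # 201
print(infer_num_classes(model_channels))       # 21
```

Under that assumption, the checkpoint was trained with 201 classes while the network built at evaluation time has 21, which is exactly what the size mismatch messages report.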

engmubarak48, Jul 22 '19 11:07

https://github.com/amdegroot/ssd.pytorch/issues/342#issuecomment-514614014

Ouwzhong, Jul 24 '19 12:07

That's because there is a file that wasn't modified before you trained your network. The file is config.py, in the data folder. Set num_classes there according to the dataset format you use and the number of classes in your custom dataset, plus one for the background. In my case, I had 2 classes, so I set it to 3.

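Do the same in ssd.py, where the build_ssd function is defined, so that the model built for evaluation has the same number of classes as the one that was trained.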
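Once num_classes agrees in both places, a quick check before calling load_state_dict can confirm that nothing else differs. This is only a minimal sketch working on plain name-to-shape dictionaries; `find_shape_mismatches` is a hypothetical helper. With PyTorch, the shapes could be collected with `{k: tuple(v.shape) for k, v in state_dict.items()}` from both the loaded checkpoint and `net.state_dict()`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters that differ.

    Both arguments map parameter names to shapes (tuples or lists of ints).
    Names missing from the model are ignored here; load_state_dict reports
    those separately as unexpected keys.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Two of the shapes from the traceback above, plus a made-up matching entry
# ("example.weight" is illustrative only, not a real parameter name).
checkpoint_shapes = {
    "conf.0.weight": (804, 512, 3, 3),
    "conf.0.bias": (804,),
    "example.weight": (64, 3, 3, 3),
}
model_shapes = {
    "conf.0.weight": (84, 512, 3, 3),
    "conf.0.bias": (84,),
    "example.weight": (64, 3, 3, 3),
}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

An empty result means every shared parameter has the same shape, so the checkpoint should load without the size mismatch error.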

Mohit-robo, Aug 27 '22 15:08