Getting an error while running inference.py
I am getting this error while running inference.py. I have tried reinstalling different versions of torch and torchvision, but nothing seems to work.
(dope_training) add@add-MS-7C84:~/kick_blenderproc/Deep_Object_Pose/train$ python ../inference/inference.py --weights output/weights/net_epoch_60.pth --data palletjack_data_test/ --object palletjack
/home/add/kick_blenderproc/Deep_Object_Pose/train/output/dope_training/lib/python3.8/site-packages/albumentations/__init__.py:13: UserWarning: A new version of Albumentations is available: 2.0.2 (you have 1.4.18). Upgrade using: pip install -U albumentations. To disable automatic update checks, set the environment variable NO_ALBUMENTATIONS_UPDATE to 1.
  check_for_updates()
Found 1 weights.
Loading DOPE model 'output/weights/net_epoch_60.pth'...
/home/add/kick_blenderproc/Deep_Object_Pose/train/output/dope_training/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/add/kick_blenderproc/Deep_Object_Pose/train/output/dope_training/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=None.
  warnings.warn(msg)
/home/add/kick_blenderproc/Deep_Object_Pose/train/../common/detector.py:274: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals.
We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  net.load_state_dict(torch.load(path, map_location=device))
Traceback (most recent call last):
  File "../inference/inference.py", line 258, in <module>
    dope_node = DopeNode(config, weight, opt.parallel, opt.object)
  File "../inference/inference.py", line 49, in __init__
    self.model.load_net_model()
  File "/home/addb/kick_blenderproc/Deep_Object_Pose/train/../common/detector.py", line 253, in load_net_model
    self.net = self.load_net_model_path(self.net_path)
  File "/home/add/kick_blenderproc/Deep_Object_Pose/train/../common/detector.py", line 274, in load_net_model_path
    net.load_state_dict(torch.load(path, map_location=device))
  File "/home/add/kick_blenderproc/Deep_Object_Pose/train/output/dope_training/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DopeNetwork:
Missing key(s) in state_dict: "vgg.0.weight", "vgg.0.bias", "vgg.2.weight", "vgg.2.bias", "vgg.5.weight", "vgg.5.bias", "vgg.7.weight", "vgg.7.bias", "vgg.10.weight", "vgg.10.bias", "vgg.12.weight", "vgg.12.bias", "vgg.14.weight", "vgg.14.bias", "vgg.16.weight", "vgg.16.bias", "vgg.19.weight", "vgg.19.bias", "vgg.21.weight", "vgg.21.bias", "vgg.23.weight", "vgg.23.bias", "vgg.25.weight", "vgg.25.bias", "m1_2.0.weight", "m1_2.0.bias", "m1_2.2.weight", "m1_2.2.bias", "m1_2.4.weight", "m1_2.4.bias", "m1_2.6.weight", "m1_2.6.bias", "m1_2.8.weight", "m1_2.8.bias", "m2_2.0.weight", "m2_2.0.bias", "m2_2.2.weight", "m2_2.2.bias", "m2_2.4.weight", "m2_2.4.bias", "m2_2.6.weight", "m2_2.6.bias", "m2_2.8.weight", "m2_2.8.bias", "m2_2.10.weight", "m2_2.10.bias", "m2_2.12.weight", "m2_2.12.bias", "m3_2.0.weight", "m3_2.0.bias", "m3_2.2.weight", "m3_2.2.bias", 
"m3_2.4.weight", "m3_2.4.bias", "m3_2.6.weight", "m3_2.6.bias", "m3_2.8.weight", "m3_2.8.bias", "m3_2.10.weight", "m3_2.10.bias", "m3_2.12.weight", "m3_2.12.bias", "m4_2.0.weight", "m4_2.0.bias", "m4_2.2.weight", "m4_2.2.bias", "m4_2.4.weight", "m4_2.4.bias", "m4_2.6.weight", "m4_2.6.bias", "m4_2.8.weight", "m4_2.8.bias", "m4_2.10.weight", "m4_2.10.bias", "m4_2.12.weight", "m4_2.12.bias", "m5_2.0.weight", "m5_2.0.bias", "m5_2.2.weight", "m5_2.2.bias", "m5_2.4.weight", "m5_2.4.bias", "m5_2.6.weight", "m5_2.6.bias", "m5_2.8.weight", "m5_2.8.bias", "m5_2.10.weight", "m5_2.10.bias", "m5_2.12.weight", "m5_2.12.bias", "m6_2.0.weight", "m6_2.0.bias", "m6_2.2.weight", "m6_2.2.bias", "m6_2.4.weight", "m6_2.4.bias", "m6_2.6.weight", "m6_2.6.bias", "m6_2.8.weight", "m6_2.8.bias", "m6_2.10.weight", "m6_2.10.bias", "m6_2.12.weight", "m6_2.12.bias", "m1_1.0.weight", "m1_1.0.bias", "m1_1.2.weight", "m1_1.2.bias", "m1_1.4.weight", "m1_1.4.bias", "m1_1.6.weight", "m1_1.6.bias", "m1_1.8.weight", "m1_1.8.bias", "m2_1.0.weight", "m2_1.0.bias", "m2_1.2.weight", "m2_1.2.bias", "m2_1.4.weight", "m2_1.4.bias", "m2_1.6.weight", "m2_1.6.bias", "m2_1.8.weight", "m2_1.8.bias", "m2_1.10.weight", "m2_1.10.bias", "m2_1.12.weight", "m2_1.12.bias", "m3_1.0.weight", "m3_1.0.bias", "m3_1.2.weight", "m3_1.2.bias", "m3_1.4.weight", "m3_1.4.bias", "m3_1.6.weight", "m3_1.6.bias", "m3_1.8.weight", "m3_1.8.bias", "m3_1.10.weight", "m3_1.10.bias", "m3_1.12.weight", "m3_1.12.bias", "m4_1.0.weight", "m4_1.0.bias", "m4_1.2.weight", "m4_1.2.bias", "m4_1.4.weight", "m4_1.4.bias", "m4_1.6.weight", "m4_1.6.bias", "m4_1.8.weight", "m4_1.8.bias", "m4_1.10.weight", "m4_1.10.bias", "m4_1.12.weight", "m4_1.12.bias", "m5_1.0.weight", "m5_1.0.bias", "m5_1.2.weight", "m5_1.2.bias", "m5_1.4.weight", "m5_1.4.bias", "m5_1.6.weight", "m5_1.6.bias", "m5_1.8.weight", "m5_1.8.bias", "m5_1.10.weight", "m5_1.10.bias", "m5_1.12.weight", "m5_1.12.bias", "m6_1.0.weight", "m6_1.0.bias", "m6_1.2.weight", "m6_1.2.bias", 
"m6_1.4.weight", "m6_1.4.bias", "m6_1.6.weight", "m6_1.6.bias", "m6_1.8.weight", "m6_1.8.bias", "m6_1.10.weight", "m6_1.10.bias", "m6_1.12.weight", "m6_1.12.bias". Unexpected key(s) in state_dict: "module.vgg.0.weight", "module.vgg.0.bias", "module.vgg.2.weight", "module.vgg.2.bias", "module.vgg.5.weight", "module.vgg.5.bias", "module.vgg.7.weight", "module.vgg.7.bias", "module.vgg.10.weight", "module.vgg.10.bias", "module.vgg.12.weight", "module.vgg.12.bias", "module.vgg.14.weight", "module.vgg.14.bias", "module.vgg.16.weight", "module.vgg.16.bias", "module.vgg.19.weight", "module.vgg.19.bias", "module.vgg.21.weight", "module.vgg.21.bias", "module.vgg.23.weight", "module.vgg.23.bias", "module.vgg.25.weight", "module.vgg.25.bias", "module.m1_2.0.weight", "module.m1_2.0.bias", "module.m1_2.2.weight", "module.m1_2.2.bias", "module.m1_2.4.weight", "module.m1_2.4.bias", "module.m1_2.6.weight", "module.m1_2.6.bias", "module.m1_2.8.weight", "module.m1_2.8.bias", "module.m2_2.0.weight", "module.m2_2.0.bias", "module.m2_2.2.weight", "module.m2_2.2.bias", "module.m2_2.4.weight", "module.m2_2.4.bias", "module.m2_2.6.weight", "module.m2_2.6.bias", "module.m2_2.8.weight", "module.m2_2.8.bias", "module.m2_2.10.weight", "module.m2_2.10.bias", "module.m2_2.12.weight", "module.m2_2.12.bias", "module.m3_2.0.weight", "module.m3_2.0.bias", "module.m3_2.2.weight", "module.m3_2.2.bias", "module.m3_2.4.weight", "module.m3_2.4.bias", "module.m3_2.6.weight", "module.m3_2.6.bias", "module.m3_2.8.weight", "module.m3_2.8.bias", "module.m3_2.10.weight", "module.m3_2.10.bias", "module.m3_2.12.weight", "module.m3_2.12.bias", "module.m4_2.0.weight", "module.m4_2.0.bias", "module.m4_2.2.weight", "module.m4_2.2.bias", "module.m4_2.4.weight", "module.m4_2.4.bias", "module.m4_2.6.weight", "module.m4_2.6.bias", "module.m4_2.8.weight", "module.m4_2.8.bias", "module.m4_2.10.weight", "module.m4_2.10.bias", "module.m4_2.12.weight", "module.m4_2.12.bias", "module.m5_2.0.weight", "module.m5_2.0.bias", 
"module.m5_2.2.weight", "module.m5_2.2.bias", "module.m5_2.4.weight", "module.m5_2.4.bias", "module.m5_2.6.weight", "module.m5_2.6.bias", "module.m5_2.8.weight", "module.m5_2.8.bias", "module.m5_2.10.weight", "module.m5_2.10.bias", "module.m5_2.12.weight", "module.m5_2.12.bias", "module.m6_2.0.weight", "module.m6_2.0.bias", "module.m6_2.2.weight", "module.m6_2.2.bias", "module.m6_2.4.weight", "module.m6_2.4.bias", "module.m6_2.6.weight", "module.m6_2.6.bias", "module.m6_2.8.weight", "module.m6_2.8.bias", "module.m6_2.10.weight", "module.m6_2.10.bias", "module.m6_2.12.weight", "module.m6_2.12.bias", "module.m1_1.0.weight", "module.m1_1.0.bias", "module.m1_1.2.weight", "module.m1_1.2.bias", "module.m1_1.4.weight", "module.m1_1.4.bias", "module.m1_1.6.weight", "module.m1_1.6.bias", "module.m1_1.8.weight", "module.m1_1.8.bias", "module.m2_1.0.weight", "module.m2_1.0.bias", "module.m2_1.2.weight", "module.m2_1.2.bias", "module.m2_1.4.weight", "module.m2_1.4.bias", "module.m2_1.6.weight", "module.m2_1.6.bias", "module.m2_1.8.weight", "module.m2_1.8.bias", "module.m2_1.10.weight", "module.m2_1.10.bias", "module.m2_1.12.weight", "module.m2_1.12.bias", "module.m3_1.0.weight", "module.m3_1.0.bias", "module.m3_1.2.weight", "module.m3_1.2.bias", "module.m3_1.4.weight", "module.m3_1.4.bias", "module.m3_1.6.weight", "module.m3_1.6.bias", "module.m3_1.8.weight", "module.m3_1.8.bias", "module.m3_1.10.weight", "module.m3_1.10.bias", "module.m3_1.12.weight", "module.m3_1.12.bias", "module.m4_1.0.weight", "module.m4_1.0.bias", "module.m4_1.2.weight", "module.m4_1.2.bias", "module.m4_1.4.weight", "module.m4_1.4.bias", "module.m4_1.6.weight", "module.m4_1.6.bias", "module.m4_1.8.weight", "module.m4_1.8.bias", "module.m4_1.10.weight", "module.m4_1.10.bias", "module.m4_1.12.weight", "module.m4_1.12.bias", "module.m5_1.0.weight", "module.m5_1.0.bias", "module.m5_1.2.weight", "module.m5_1.2.bias", "module.m5_1.4.weight", "module.m5_1.4.bias", "module.m5_1.6.weight", "module.m5_1.6.bias", 
"module.m5_1.8.weight", "module.m5_1.8.bias", "module.m5_1.10.weight", "module.m5_1.10.bias", "module.m5_1.12.weight", "module.m5_1.12.bias", "module.m6_1.0.weight", "module.m6_1.0.bias", "module.m6_1.2.weight", "module.m6_1.2.bias", "module.m6_1.4.weight", "module.m6_1.4.bias", "module.m6_1.6.weight", "module.m6_1.6.bias", "module.m6_1.8.weight", "module.m6_1.8.bias", "module.m6_1.10.weight", "module.m6_1.10.bias", "module.m6_1.12.weight", "module.m6_1.12.bias".
I also downgraded torchvision to 0.12.1 as per this, but it said that my RTX 3090 doesn't support this version. Kindly help me with this matter.
I have the same question. Did you solve this problem?
I think you need to downgrade torch. Someone else had a similar problem, so check the other issues.
@liuwenchao12480 No, I was not able to solve this problem. I tried downgrading, but my GPU didn't support that version, so I tried different versions of torch and torchvision. It still didn't work, so I gave up. If you are able to resolve it, please do tell me.
Check with ChatGPT on how to rename the modules so they match!
The reason for this issue is that you used multiple GPUs for training, which adds a "module." prefix to every key in the saved weights (compare the missing "vgg.0.weight" with the unexpected "module.vgg.0.weight" in the error). You only need to disable multi-GPU training during the training process @liuwenchao12480
I ran into the same problem while trying to execute the example command python inference.py --weights ../weights --data ../sample_data --object cracker
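For anyone who wants to see where the prefix comes from, here is a minimal sketch. It assumes the training script wraps the network in torch.nn.DataParallel (I have not checked the DOPE training code), and the tiny nn.Sequential model is only a stand-in for DopeNetwork.

```python
import torch.nn as nn

# Stand-in for the real network; any nn.Module behaves the same way.
model = nn.Sequential(nn.Linear(4, 2))

# Wrapping for multi-GPU training stores the model under the attribute `module`,
# so every key in the wrapper's state dict gains a "module." prefix.
wrapped = nn.DataParallel(model)

plain_keys = list(model.state_dict().keys())
wrapped_keys = list(wrapped.state_dict().keys())

# Saving the inner model's state dict avoids the prefix in the checkpoint.
unwrapped_keys = list(wrapped.module.state_dict().keys())

print(plain_keys)
print(wrapped_keys)
print(unwrapped_keys)
```

So instead of disabling multi-GPU training, saving `wrapped.module.state_dict()` rather than `wrapped.state_dict()` should also give a checkpoint that loads into the plain model.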
The code base is not maintained anymore, happy to accept a PR for this problem though. https://chatgpt.com/share/67f7df12-2c10-8013-8af1-db769cfc7a85 please check this to fix your problem.
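To complete @TontonTremblay’s answer, the issue is quite simple: the error is caused by the dictionary you’re saving not having the same key names the model loader expects. Every saved key starts with "module.", while the loader expects the same names without that prefix. This is probably due to how the weights were saved rather than to the installed PyTorch version.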
I created this script with ChatGPT. It converts all the keys in the weight files (.pth) to the names the model loader expects.
#!/usr/bin/env python3
import os
from collections import OrderedDict

import torch

WEIGHTS_DIR = "./output/weights"
PREFIX = "module."

for filename in os.listdir(WEIGHTS_DIR):
    if filename.endswith(".pth") and filename.startswith("net_epoch_"):
        file_path = os.path.join(WEIGHTS_DIR, filename)
        print(f"Processing {file_path} ...")

        # Load the original state dict on the CPU so no GPU is needed
        state_dict = torch.load(file_path, map_location="cpu")

        # Only convert if the 'module.' prefix exists (guard against an empty dict)
        keys = list(state_dict.keys())
        if keys and keys[0].startswith(PREFIX):
            new_state_dict = OrderedDict()
            for k, v in state_dict.items():
                # Drop the prefix; keys without it are kept unchanged
                name = k[len(PREFIX):] if k.startswith(PREFIX) else k
                new_state_dict[name] = v

            # Save a new file with a _cleaned suffix; the original is left untouched
            cleaned_path = os.path.join(WEIGHTS_DIR, filename.replace(".pth", "_cleaned.pth"))
            torch.save(new_state_dict, cleaned_path)
            print(f"Saved cleaned weights to {cleaned_path}")
        else:
            print(f"No 'module.' prefix found in {filename}, skipping.")
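If you would rather not write new files, the same renaming can be done in memory at load time. This is only a sketch: `strip_module_prefix` is a hypothetical helper name, not something that exists in the repository, and it works on any mapping, so it can be tried without a real checkpoint.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Keys without the prefix are kept unchanged, so already-clean
        # checkpoints pass through untouched.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Small demonstration with placeholder values instead of tensors.
example = {"module.vgg.0.weight": 1, "module.vgg.0.bias": 2, "m1_2.0.weight": 3}
print(list(strip_module_prefix(example).keys()))
```

In detector.py the call from the traceback would then presumably become `net.load_state_dict(strip_module_prefix(torch.load(path, map_location=device)))`, but I have not tested that change against the repository.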
Loading weights whose keys carry the "module." prefix is what the --parallel flag in inference.py is for.
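I have not checked how inference.py implements the flag, but a common way to load such weights without renaming anything is to wrap the model in torch.nn.DataParallel before calling load_state_dict, so the expected keys carry the same prefix. A minimal sketch of that idea, where `build` and the small stand-in network are assumptions for illustration only:

```python
import torch
import torch.nn as nn


def build():
    # Stand-in for constructing the real network (hypothetical helper).
    return nn.Sequential(nn.Linear(4, 2))


# Simulate a checkpoint saved from a DataParallel-wrapped model:
# its keys look like "module.0.weight".
checkpoint = nn.DataParallel(build()).state_dict()

# Wrap the fresh model the same way, so its expected keys also start with "module.".
net = nn.DataParallel(build())
net.load_state_dict(checkpoint)  # strict loading succeeds, no missing or unexpected keys

# The inner model is available as `net.module` if the wrapper is not wanted afterwards.
inference_net = net.module
weights_match = torch.equal(inference_net[0].weight, checkpoint["module.0.weight"])
print(weights_match)
```

Loading the same checkpoint into an unwrapped `build()` would raise the "Missing key(s)" / "Unexpected key(s)" error from the first post.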