trt_pose
Getting the optimized model takes forever
Here is my code, based on the demo, to generate the optimized model resnet18_baseline_att_224x224_A_epoch_249_trt.pth:
import json
import trt_pose.coco
with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)
topology = trt_pose.coco.coco_category_to_topology(human_pose)
import trt_pose.models
num_parts = len(human_pose['keypoints'])
num_links = len(human_pose['skeleton'])
model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()
print(num_parts)
print(num_links)
import torch
MODEL_WEIGHTS = 'resnet18_baseline_att_224x224_A_epoch_249.pth'
model.load_state_dict(torch.load(MODEL_WEIGHTS))
WIDTH = 224
HEIGHT = 224
data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()
import torch2trt
model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
OPTIMIZED_MODEL = 'resnet18_baseline_att_224x224_A_epoch_249_trt.pth'
torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)
It has been running for more than 2 hours already, and the Jetson Nano is completely unresponsive.
Is this the expected behavior? How long does it usually take to generate the optimized model?
After a long wait I got this:
python3 humanpose.py
18
21
[ 1388.798163] python3 invoked oom-killer: gfp_mask=0x24082c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
[ 1388.812530] Out of memory: Kill process 7187 (python3) score 536 or sacrifice child
[ 1388.820457] Killed process 7187 (python3) total-vm:12791888kB, anon-rss:1799072kB, file-rss:367720kB, shmem-rss:0kB
Killed
Hi, it took me ~5 mins to run on a Jetson Nano. I ran the exact same code from trt_pose/tasks/human_pose/live_demo.ipynb. It looks like your Nano ran out of memory, so consider adding swap space to your system. I have followed this tutorial in the past.
I have 8 GB of swap, and it takes about 5-8 min.
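Since the failure above was the kernel OOM killer rather than torch2trt itself, it may help to check free RAM and swap before starting the conversion. The following is only a minimal sketch: it reads the standard Linux /proc/meminfo file, and the helper names (parse_meminfo, has_enough_memory, report) and the 4 GB threshold are hypothetical choices for illustration, not part of trt_pose or torch2trt and not a measured requirement.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few counters (e.g. HugePages_Total)
    have no unit and are kept as plain integers.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(':')
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def has_enough_memory(info, required_kb):
    """Return True if available RAM plus free swap covers required_kb."""
    available = info.get('MemAvailable', info.get('MemFree', 0))
    return available + info.get('SwapFree', 0) >= required_kb


# Hypothetical headroom of 4 GB (in kB); adjust for your own setup.
REQUIRED_KB = 4 * 1024 * 1024


def report(path='/proc/meminfo'):
    """Print available RAM and free swap, and warn if headroom looks low."""
    with open(path, 'r') as f:
        info = parse_meminfo(f.read())
    print('MemAvailable kB:', info.get('MemAvailable'))
    print('SwapFree kB:', info.get('SwapFree'))
    if not has_enough_memory(info, REQUIRED_KB):
        print('Low memory: consider adding swap before running torch2trt.')
```

Calling report() on the Nano right before the torch2trt.torch2trt(...) line shows whether swap is actually enabled; a SwapFree value of 0 would suggest the swap file is not active.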