trt_pose
Training the model
Hello NVIDIA AI-IOT team,
First of all, thank you very much for your effort in creating this code. I am Zeyan, currently working on a real-time pose estimation implementation on the Jetson AGX Xavier. My goal is to use depth images (from an Intel RealSense camera) and check whether the depth information can help improve the performance of pose estimation.
Before I conduct my experiments, I first wish to train the model to serve as a baseline. From your training script it seems a config.json file is required to train the network. As I wish to follow your parameters for this baseline training, it would be great if you could provide your config file so that I can follow your steps and parameters to train your model.
Thanks in advance for your help and support. I look forward to your reply.
Thanks, Dr. Zeyan Oo
Hi dy1ngs0ul,
Thanks for reaching out!
You can find the training configuration files in the experiments directory; for example:
https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json
Please let me know if you have any questions.
Best, John
Thanks for your help
@jaybdub, thanks for your excellent work!
So far I understand that cmap_channels is the number of keypoints and paf_channels equals 2 * the number of connections. Can you explain what upsample_channels means?
"model": {
"name": "densenet121_baseline_att",
"kwargs": {
"cmap_channels": 18,
"paf_channels": 42,
"upsample_channels": 256,
"num_upsample": 3
}
},
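Not a maintainer answer, but here is one plausible reading, based on common pose-estimation decoder designs (a speculative sketch, not the repository's actual model code): num_upsample transposed-convolution stages each produce upsample_channels feature channels, and 1x1 convolutions then project those shared features to the cmap/paf outputs.

import torch.nn as nn

def make_head(in_channels, cmap_channels=18, paf_channels=42,
              upsample_channels=256, num_upsample=3):
    # Each stage doubles the spatial resolution and keeps
    # `upsample_channels` feature channels.
    layers = []
    ch = in_channels
    for _ in range(num_upsample):
        layers += [
            nn.ConvTranspose2d(ch, upsample_channels,
                               kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(upsample_channels),
            nn.ReLU(inplace=True),
        ]
        ch = upsample_channels
    upsample = nn.Sequential(*layers)
    # 1x1 heads project the shared features to the two output maps.
    cmap_head = nn.Conv2d(upsample_channels, cmap_channels, kernel_size=1)
    paf_head = nn.Conv2d(upsample_channels, paf_channels, kernel_size=1)
    return upsample, cmap_head, paf_head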
Hi guys! Has any of you successfully completed training using the script provided in the repo?
I'm trying to prune the models, but I can't proceed with retraining via train.py because of an inconsistency between the PAF tensors' sizes:
Traceback (most recent call last):
  File "provaTrain.py", line 150, in <module>
    paf_mse = torch.mean(mask * (paf_out - paf)**2)
  File "/usr/local/lib/python3.6/dist-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/wrap.py", line 58, in wrapper
    return orig_fn(*new_args, **kwargs)
RuntimeError: The size of tensor a (42) must match the size of tensor b (38) at non-singleton dimension 1
I solved it myself, thank you anyway!
Hey all, I'm having a similar error as @NicolaGugole using the training dataset downloaded through the provided shell script.
Any tips on how to fix this would be greatly appreciated!
Edit: Never mind, one just has to edit the model attribute of the JSON file referenced earlier to match the tensor sizes.
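For reference, here is a minimal sketch of how those two kwargs relate to the pose topology (assuming the tasks/human_pose/human_pose.json topology file from the repo, with its COCO-style "keypoints" and "skeleton" keys). The 42-vs-38 mismatch above corresponds to 21 vs. 19 skeleton connections:

import json

with open('human_pose.json') as f:
    topology = json.load(f)

num_keypoints = len(topology['keypoints'])   # 18 once the neck keypoint is added
num_links = len(topology['skeleton'])        # number of limb connections

cmap_channels = num_keypoints                # one confidence map per keypoint
paf_channels = 2 * num_links                 # an x and y vector field per connection

print(cmap_channels, paf_channels)           # 18, 42 for the stock topology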
In my case I had to change the annotation file, because I noticed a difference between the annotation keypoint count (17 keypoints) and the human_pose.json count (18 keypoints). This difference in tensor sizes is odd in my opinion. Forcing the sizes to match did not produce a fruitful training in my case, presumably because the annotation files contain values created for 17 keypoints while we modified the model to expect 18.
I noticed that in this config file (https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json) the devs used a "modified" version of the annotation file. I hope in the near future we'll have the opportunity to take a look at these modified files (maybe the devs could upload them to this repo).
So I have a question, @OliverGuy: did you just change the cmap_channels and paf_channels kwargs in the JSON file referenced earlier? Did that do the job? I tried to do the same but ended up with other conflicts.
Sorry for bothering you all, have a nice day!
@NicolaGugole I only modified those in the JSON, but now I'm having issues with cuDNN not finding the convolution algorithm (see #54).
@NicolaGugole
You have to pre-process the COCO annotations. This adds the "neck" keypoint (the midpoint of the shoulders) so that you end up with 18 keypoints. Use the command:
python3 preprocess_coco_person.py annotations/person_keypoints_train2017.json annotations/person_keypoints_train2017_modified.json
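Conceptually, the added keypoint is just the shoulder midpoint. A rough, hypothetical sketch of that step (the real script may insert the neck at a different index and handle more bookkeeping):

# COCO stores keypoints as a flat [x1, y1, v1, x2, y2, v2, ...] list;
# indices 5 and 6 are left_shoulder and right_shoulder in COCO ordering.
def add_neck(flat_keypoints):
    kps = [flat_keypoints[i:i + 3] for i in range(0, len(flat_keypoints), 3)]
    ls, rs = kps[5], kps[6]
    if ls[2] > 0 and rs[2] > 0:              # both shoulders annotated
        neck = [(ls[0] + rs[0]) / 2.0,
                (ls[1] + rs[1]) / 2.0,
                min(ls[2], rs[2])]
    else:
        neck = [0.0, 0.0, 0]                 # leave the neck unlabeled
    kps.append(neck)                         # 17 -> 18 keypoints
    return [v for kp in kps for v in kp]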
Did you ever figure out what upsample_channels means? I am struggling with the same issue.