
How to increase Model accuracy

Open AmoghBharadwaj opened this issue 4 years ago • 10 comments

Hello Team,

I am trying to run human pose estimation on video data, and the model seems to miss detections — for example, when the person stretches their hands or tilts slightly — and often gives wrong detections. Can you please let me know how to increase the model's prediction accuracy? Thank you.

AmoghBharadwaj avatar Aug 03 '20 21:08 AmoghBharadwaj

Hi AmoghBharadwaj,

Thanks for reaching out.

Which model are you running? Typically the densenet121 model yields better accuracy than the resnet18 model.

You could also experiment with adjusting the image resolution.

Aside from that, training or experimenting with model architectures is likely necessary.

Please let me know if this helps or if you have any questions.

Best, John

jaybdub avatar Aug 03 '20 21:08 jaybdub

Hello @jaybdub ,

I am using the resnet model and will definitely try out densenet.

Also, can you please shed more light on training and experimenting with model architectures, and how to do it?

Thank you.

AmoghBharadwaj avatar Aug 03 '20 21:08 AmoghBharadwaj

Sure,

Training a Model

  1. Download COCO

    cd tasks/human_pose
    source download_coco.sh
    unzip train2017.zip
    unzip val2017.zip
    unzip annotations_trainval2017.zip
    
  2. Pre-process the COCO annotations. This adds the "Neck" keypoint (midpoint of the shoulders).

    python3 preprocess_coco_person.py annotations/person_keypoints_train2017.json annotations/person_keypoints_train2017_modified.json
    
  3. Add your model to this list to register it: https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/trt_pose/models/__init__.py#L7

  4. Create a model / training configuration. Easiest to start from an existing one.

    cp experiments/resnet18_baseline_att_224x224_A.json experiments/my_model.json
    
  5. Set the model and arguments corresponding to how you defined / registered your model (see https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json#L48 for example)
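
The "Neck" keypoint added in step 2 is just the midpoint of the two shoulders. A minimal sketch of that idea (the function name is illustrative, not the actual code in preprocess_coco_person.py):

```python
def add_neck(keypoints):
    """keypoints: flat COCO list [x1, y1, v1, x2, y2, v2, ...].
    COCO indices 5 and 6 are the left and right shoulders."""
    lx, ly, lv = keypoints[5 * 3:5 * 3 + 3]
    rx, ry, rv = keypoints[6 * 3:6 * 3 + 3]
    if lv > 0 and rv > 0:
        # Both shoulders annotated: neck is their midpoint.
        neck = [(lx + rx) / 2, (ly + ry) / 2, min(lv, rv)]
    else:
        # Either shoulder missing: mark the neck as unlabeled.
        neck = [0, 0, 0]
    return keypoints + neck
```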

When you're defining your model, the main requirements are:

  1. It takes an input image.
  2. It returns cmap and paf feature maps. The spatial shape should match the target you set when training (https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json#L7)
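
For illustration, a toy module satisfying that interface might look like the sketch below (this is not a trt_pose model — the channel counts follow the 18-keypoint / 21-link human_pose topology, but treat them and the class name as placeholders). With two stride-2 convolutions, a 224x224 input yields 56x56 feature maps, matching a target_shape of [56, 56].

```python
import torch
import torch.nn as nn

class TinyPoseNet(nn.Module):
    """Toy example of the required interface: image in, (cmap, paf) out."""
    def __init__(self, cmap_channels=18, paf_channels=42):
        super().__init__()
        # Two stride-2 convs downsample 224x224 -> 56x56 (a factor of 4).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cmap_head = nn.Conv2d(64, cmap_channels, 1)  # part confidence maps
        self.paf_head = nn.Conv2d(64, paf_channels, 1)    # part affinity fields

    def forward(self, image):
        features = self.backbone(image)
        return self.cmap_head(features), self.paf_head(features)
```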

I'd start by trying densenet121. The models listed in the README are the product of a fair amount of experimentation, but there may well be improvements you could make. It's possible those improvements have more to do with other factors — such as the training script or data augmentation methods — than with the architecture itself, which would require more modification to the actual code, but it could be the architecture as well.

Hope this helps, let me know if you run into any issues.

Best, John

jaybdub avatar Aug 03 '20 21:08 jaybdub


@jaybdub How can I execute the training script after completing the above steps? I have configured the JSON file. How can I use it and how can I train? Can you tell me more about it? Thank you very much!

dreamilk avatar Oct 07 '20 09:10 dreamilk

@dreamilk have you retrained successfully? I want to retrain with my dataset and I am not clear about how to do. Could you tell me the way? Thanks very much!

tucachmo2202 avatar Feb 26 '21 01:02 tucachmo2202


I have tried training, and some of the official JSON files can be used for training. If I remember correctly, train.py is the training script. train.py may contain errors, so you need to fix them yourself; for example, `from .models import models` may need to be changed to `from trt_pose.models import models`. The specific situation needs specific analysis. Good luck!
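
The step the experiment JSON drives — the training script looking up the registered model by name and constructing it with the configured arguments — can be sketched roughly like this (the registry, config keys, and constructor signature below are hypothetical; the real mapping lives in trt_pose/models/__init__.py and the real schema is in the experiment JSON linked above):

```python
import json

# Hypothetical registry: name -> model constructor.
MODELS = {
    "resnet18_baseline_att":
        lambda cmap_channels, paf_channels: ("resnet18", cmap_channels, paf_channels),
}

def build_model(config):
    """Look up the configured model and build it with its kwargs."""
    entry = config["model"]
    return MODELS[entry["name"]](**entry.get("kwargs", {}))

# A stand-in for an experiment JSON file.
config = json.loads("""
{"model": {"name": "resnet18_baseline_att",
           "kwargs": {"cmap_channels": 18, "paf_channels": 42}}}
""")
```

If your own model's name isn't in the registry, the lookup fails — which is why step 3 above (registering the model) has to happen before training.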

dreamilk avatar Feb 26 '21 14:02 dreamilk

@dreamilk thanks for your reply. I am building my own dataset (mostly camera images), and I'm wondering what accuracy you achieved, and whether this model is suitable for an action detection problem?

tucachmo2202 avatar Feb 27 '21 23:02 tucachmo2202

@jaybdub how can I train a larger model at 416x416 or 608x608? The current model input size is quite small and cannot detect small objects.

spacewalk01 avatar Jul 27 '21 07:07 spacewalk01

Can you share the model? Thanks.

chh7411898 avatar Dec 20 '21 03:12 chh7411898

@jaybdub how can I train a larger model at 416x416 or 608x608? The current model input size is quite small and cannot detect small objects.

Hi, hope you still need this. You can retrain the model following the guide above, and change "image_shape" in the config file to 416 or 608, with "target_shape" = "image_shape" / 4. For example, if image_shape is [416, 416], target_shape is [104, 104].
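
That arithmetic is just the input resolution divided by the network's downsample factor (4 for these models); a one-liner to compute it:

```python
def target_shape(image_shape, stride=4):
    """Network output (target) resolution: input resolution / downsample factor."""
    return [side // stride for side in image_shape]
```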

tucachmo2202 avatar Dec 20 '21 10:12 tucachmo2202