
How to increase Model accuracy

Open AmoghBharadwaj opened this issue 4 years ago • 10 comments

Hello Team,

I am trying to run human pose estimation on video data, and the model seems to miss detections — for example, when the person stretches their hands or tilts slightly — and often gives wrong detections. Can you please let me know how to increase the model's prediction accuracy? Thank you.

AmoghBharadwaj avatar Aug 03 '20 21:08 AmoghBharadwaj

Hi AmoghBharadwaj,

Thanks for reaching out.

Which model are you running? Typically the densenet121 model yields better accuracy than the resnet18 model.

You could also experiment with adjusting the image resolution.

Aside from that, training or experimenting with model architectures is likely necessary.

Please let me know if this helps or if you have any questions.

Best, John

jaybdub avatar Aug 03 '20 21:08 jaybdub

Hello @jaybdub ,

I am using the resnet model and will definitely try out densenet.

Also, can you please shed more light on training and experimenting with model architectures, and how to do it?

Thank you.

AmoghBharadwaj avatar Aug 03 '20 21:08 AmoghBharadwaj

Sure,

Training a Model

  1. Download COCO

    cd tasks/human_pose
    source download_coco.sh
    unzip train2017.zip
    unzip val2017.zip
    unzip annotations_trainval2017.zip
    
  2. Pre-process the COCO annotations. This adds the "Neck" keypoint (midpoint of the shoulders).

    python3 preprocess_coco_person.py annotations/person_keypoints_train2017.json annotations/person_keypoints_train2017_modified.json
    
  3. Add your model to this list to register it: https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/trt_pose/models/__init__.py#L7

  4. Create a model / training configuration. Easiest to start from an existing one.

    cp experiments/resnet18_baseline_att_224x224_A.json experiments/my_model.json
    
  5. Set the model and arguments corresponding to how you defined / registered your model (see https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json#L48 for example)
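
The "Neck" keypoint added in step 2 is just the midpoint of the two shoulders. A minimal sketch of that idea (the function name is illustrative, not the actual code in preprocess_coco_person.py):

```python
def add_neck(keypoints):
    """keypoints: flat COCO list [x1, y1, v1, x2, y2, v2, ...].
    COCO indices 5 and 6 are the left and right shoulders."""
    lx, ly, lv = keypoints[5 * 3:5 * 3 + 3]
    rx, ry, rv = keypoints[6 * 3:6 * 3 + 3]
    if lv > 0 and rv > 0:
        # Both shoulders annotated: neck is their midpoint.
        neck = [(lx + rx) / 2, (ly + ry) / 2, min(lv, rv)]
    else:
        # Either shoulder missing: mark the neck as unlabeled.
        neck = [0, 0, 0]
    return keypoints + neck
```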

When you're defining your model, the main requirements are:

  1. It takes an input image.
  2. It returns cmap and paf feature maps. The spatial shape should match the target you set when training (https://github.com/NVIDIA-AI-IOT/trt_pose/blob/master/tasks/human_pose/experiments/resnet18_baseline_att_224x224_A.json#L7)
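
For illustration, a toy module satisfying that interface might look like the sketch below (this is not a trt_pose model — the channel counts follow the 18-keypoint / 21-link human_pose topology, but treat them and the class name as placeholders). With two stride-2 convolutions, a 224x224 input yields 56x56 feature maps, matching a target_shape of [56, 56].

```python
import torch
import torch.nn as nn

class TinyPoseNet(nn.Module):
    """Toy example of the required interface: image in, (cmap, paf) out."""
    def __init__(self, cmap_channels=18, paf_channels=42):
        super().__init__()
        # Two stride-2 convs downsample 224x224 -> 56x56 (a factor of 4).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cmap_head = nn.Conv2d(64, cmap_channels, 1)  # part confidence maps
        self.paf_head = nn.Conv2d(64, paf_channels, 1)    # part affinity fields

    def forward(self, image):
        features = self.backbone(image)
        return self.cmap_head(features), self.paf_head(features)
```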

I'd start by trying densenet121. The models listed in the README are the product of a fair amount of experimentation, but there may well be improvements you could make. It's possible those improvements have more to do with other factors — such as the training script or data augmentation methods — than with the architecture itself, which would require more modification to the actual code, but it could be the architecture as well.

Hope this helps, let me know if you run into any issues.

Best, John

jaybdub avatar Aug 03 '20 21:08 jaybdub


@jaybdub How can I execute the training script after completing the above steps? I have configured the JSON file. How can I use it and how can I train? Can you tell me more about it? Thank you very much!

dreamilk avatar Oct 07 '20 09:10 dreamilk

@dreamilk have you retrained successfully? I want to retrain with my dataset and I am not clear about how to do. Could you tell me the way? Thanks very much!

tucachmo2202 avatar Feb 26 '21 01:02 tucachmo2202


I have tried training, and some of the official JSON files can be used for training. If I remember correctly, train.py is the training script. train.py may contain errors, so you need to fix them yourself; for example, `from .models import models` may need to be changed to `from trt_pose.models import models`. The specific situation needs specific analysis. Good luck!
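
The step the experiment JSON drives — the training script looking up the registered model by name and constructing it with the configured arguments — can be sketched roughly like this (the registry, config keys, and constructor signature below are hypothetical; the real mapping lives in trt_pose/models/__init__.py and the real schema is in the experiment JSON linked above):

```python
import json

# Hypothetical registry: name -> model constructor.
MODELS = {
    "resnet18_baseline_att":
        lambda cmap_channels, paf_channels: ("resnet18", cmap_channels, paf_channels),
}

def build_model(config):
    """Look up the configured model and build it with its kwargs."""
    entry = config["model"]
    return MODELS[entry["name"]](**entry.get("kwargs", {}))

# A stand-in for an experiment JSON file.
config = json.loads("""
{"model": {"name": "resnet18_baseline_att",
           "kwargs": {"cmap_channels": 18, "paf_channels": 42}}}
""")
```

If your own model's name isn't in the registry, the lookup fails — which is why step 3 above (registering the model) has to happen before training.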

dreamilk avatar Feb 26 '21 14:02 dreamilk

@dreamilk thanks for your reply. I am building my own dataset (mostly camera images), and I'm wondering what accuracy you achieved, and whether this model is suitable for an action detection problem?

tucachmo2202 avatar Feb 27 '21 23:02 tucachmo2202

@jaybdub how can I train a larger model at 416x416 or 608x608? The current model input size is quite small and cannot detect small objects.

spacewalk01 avatar Jul 27 '21 07:07 spacewalk01

Can you share the model? Thanks.

chh7411898 avatar Dec 20 '21 03:12 chh7411898

@jaybdub how can I train a larger model at 416x416 or 608x608? The current model input size is quite small and cannot detect small objects.

Hi, hope you still need this. You can retrain the model following the guide above, and change "image_shape" in the config file to 416 or 608, with "target_shape" = "image_shape" / 4. For example, if image_shape is [416, 416], target_shape is [104, 104].
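
That arithmetic is just the input resolution divided by the network's downsample factor (4 for these models); a one-liner to compute it:

```python
def target_shape(image_shape, stride=4):
    """Network output (target) resolution: input resolution / downsample factor."""
    return [side // stride for side in image_shape]
```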

tucachmo2202 avatar Dec 20 '21 10:12 tucachmo2202