
How to use model with deepstream

GstreamNovice opened this issue 4 years ago • 8 comments

Hello, I was curious to see if the pose-estimation model can be used with deepstream.

I believe I have to convert the model to TensorRT, use it as the pgie (primary) detector, and then use a second inference engine for the classifier. After that, I need to implement some kind of post-processing logic in gst-dsexample and finally have nvosd render the skeleton onto the frame.

What else would I need to do? Also, do I have to implement the post-processing in gst-dsexample? If so, how should I go about doing that?

I know I am not the only one looking for an answer to this question. So any help would be greatly appreciated by me and by the community.
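For reference, a hypothetical nvinfer (pgie) config for this kind of setup might look like the sketch below. The file paths and input dimensions are placeholders, and the normalization values assume the usual ImageNet statistics; `network-type=100` ("other") with `output-tensor-meta=1` tells DeepStream to attach the raw output tensors so downstream post-processing can consume them:

```ini
# Hypothetical pgie config for a trt_pose network (paths are placeholders).
[property]
gpu-id=0
onnx-file=pose_resnet18.onnx
model-engine-file=pose_resnet18.onnx_b1_gpu0_fp16.engine
# Assumed ImageNet normalization: y = net-scale-factor * (x - offsets)
net-scale-factor=0.0174
offsets=123.675;116.28;103.53
model-color-format=0
infer-dims=3;224;224
network-mode=2
# network-type=100 ("other"): no built-in parsing; emit raw tensors instead.
network-type=100
output-tensor-meta=1
```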

GstreamNovice avatar Mar 15 '20 02:03 GstreamNovice

Hi GstreamNovice,

Thanks for reaching out!

Do you mind sharing a reference to the DeepStream sample you're basing this on? I am less familiar with this area but will try to help however I can.

As for this project, there are two stages:

  1. Neural network execution (GPU, TensorRT). This takes one input binding (the image) and produces two output bindings (the confidence map and the part affinity fields).
  2. Post-processing (CPU). This takes the confidence map and part affinity fields (copied into CPU memory) and produces the object counts, object part mappings, and object part coordinates.
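As a rough illustration of stage 2, the first step is finding local maxima (peaks) in each confidence-map channel. The sketch below is a simplified, hypothetical stand-in for the real trt_pose post-processing, which additionally scores candidate part pairs against the part affinity fields to assemble per-person skeletons:

```python
import numpy as np

def find_peaks(cmap, threshold=0.1, window=2):
    """Return (row, col) coordinates of local maxima in one
    confidence-map channel that exceed `threshold`.
    Simplified stand-in for the real trt_pose post-processing."""
    peaks = []
    h, w = cmap.shape
    for i in range(h):
        for j in range(w):
            v = cmap[i, j]
            if v < threshold:
                continue
            # Compare against a small neighborhood around (i, j).
            i0, i1 = max(0, i - window), min(h, i + window + 1)
            j0, j1 = max(0, j - window), min(w, j + window + 1)
            if v >= cmap[i0:i1, j0:j1].max():
                peaks.append((i, j))
    return peaks

# Toy confidence map with a single bright spot at (3, 4).
cmap = np.zeros((8, 8), dtype=np.float32)
cmap[3, 4] = 0.9
print(find_peaks(cmap))  # [(3, 4)]
```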

Past that, it depends on the application. In the current examples, both the neural network execution and the post-processing depend on PyTorch for bindings. That said, I've taken steps to remove this dependency, which may simplify integrating it into an application.

Please let me know if you have any questions, or if you can share your use case so I better understand the challenge you're facing.

Best, John

jaybdub avatar Mar 15 '20 19:03 jaybdub

@GstreamNovice I am working on this same problem, were you successful?

ishang3 avatar Oct 11 '20 19:10 ishang3

@ishang3 first you need to follow the DeepStream sample apps; for convenience, I think you need to write the inference code in C++, like a pose plugin for DeepStream. Then write the GStreamer plugin the same way the YOLO one is written. So what's your problem? https://github.com/AlexeyAB/deepstream-plugins

thancaocuong avatar Nov 09 '20 03:11 thancaocuong

@thancaocuong the data structure for trt_pose is very different: it is not object detection with bounding-box coordinates, but a set of 18 keypoints. Also, the preprocessing is a little different, and NVIDIA has not been transparent about what changes need to be made.
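To make the difference concrete, a per-person result here is a list of keypoints rather than one box; a minimal container might look like this (the keypoint names are assumed from trt_pose's human_pose.json topology, so verify them against your own topology file):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# 18-keypoint skeleton assumed from trt_pose's human_pose.json topology.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle", "neck",
]

@dataclass
class Pose:
    # (x, y) in normalized image coordinates; None if the part was not detected.
    keypoints: List[Optional[Tuple[float, float]]]

pose = Pose(keypoints=[None] * len(KEYPOINT_NAMES))
print(len(pose.keypoints))  # 18
```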

ishang3 avatar Nov 09 '20 03:11 ishang3

I did implement the preprocessing in CUDA, including conversion to RGB, resize, and normalization. I will reorganize my code and then share it with you. Hope it helps.
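For anyone who wants a CPU reference for those same steps (BGR to RGB, resize, normalize), here is a hedged NumPy sketch. It assumes the standard ImageNet mean/std and a 224x224 input, which is what trt_pose's published checkpoints typically use, but confirm against your own training setup:

```python
import numpy as np

# Assumed ImageNet statistics; verify against the model's training config.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(bgr, size=(224, 224)):
    """BGR uint8 HxWx3 frame -> normalized float32 CHW tensor."""
    rgb = bgr[:, :, ::-1].astype(np.float32) / 255.0  # BGR -> RGB, scale to [0, 1]
    # Nearest-neighbor resize via index sampling (stand-in for a CUDA kernel).
    h, w = rgb.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = rgb[rows][:, cols]
    normalized = (resized - MEAN) / STD   # per-channel normalization
    return normalized.transpose(2, 0, 1)  # HWC -> CHW for the engine

frame = np.zeros((480, 640, 3), dtype=np.uint8)
tensor = preprocess(frame)
print(tensor.shape)  # (3, 224, 224)
```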

thancaocuong avatar Nov 09 '20 04:11 thancaocuong

@thancaocuong Thank you, I appreciate it very much.

ishang3 avatar Nov 09 '20 18:11 ishang3

@ishang3 please take a look at the nvPreprocess function. I use CUDA to normalize the RGB image and DMA it to the GPU for inference. You can also use my repo for trt_pose, but you need to add the post-processing function written in C++ (from the trt_pose repo). Feel free to ask me if you have any questions. posecpp. Also, I've written the pose estimator as a plugin, so you can easily map it to the YOLO plugin to integrate with DeepStream.

thancaocuong avatar Nov 10 '20 08:11 thancaocuong

This is an example: https://github.com/NVIDIA-AI-IOT/deepstream_pose_estimation

zhink avatar Nov 17 '20 01:11 zhink