
Models with higher input size

lweicker opened this issue 4 years ago • 3 comments

I'm really impressed with the results obtained so far. I'm wondering if you plan to train and publish models with higher input sizes? I ask because I see in tasks/human_pose/experiments/ that you experimented with larger input image sizes such as 368x368 and 384x384.

lweicker avatar Dec 02 '20 20:12 lweicker

Please, how can we use TRT-pose for high-resolution data (e.g. 1920x1080 images)?

Thanks in advance.

Azzedine-Touazi avatar Dec 09 '20 19:12 Azzedine-Touazi

Hi all,

We did a few experiments with higher resolutions, but for our primary use case (pose detection within a few meters using a Raspberry Pi camera), we didn't find much qualitative improvement, and the runtime increased. If you want to detect objects within a few meters using a typical camera (say, a 50+ degree FOV), I'd recommend simply downscaling the image before providing it to the neural network.
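A minimal sketch of the downscaling step. The 224 network size matches one of the pre-trained models mentioned in this thread; the `fit_size` helper and the scale-then-center-crop strategy are illustrative assumptions, not the project's documented preprocessing (any resize routine, e.g. `cv2.resize`, would work):

```python
def fit_size(src_w, src_h, net=224):
    """Scale so the shorter side matches the network input size,
    preserving aspect ratio; the caller can then center-crop to
    net x net before running inference."""
    scale = net / min(src_w, src_h)
    return (round(src_w * scale), round(src_h * scale))

# A 1920x1080 frame would be downscaled so its height becomes 224,
# then cropped to 224x224 around the region of interest.
print(fit_size(1920, 1080, net=224))
```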

That said, you can run the existing pre-trained models at higher resolution, but it will change the effective size range of objects you detect. To do this, you need to adjust the size of the input data you provide to the model when optimizing with TensorRT. This has not been thoroughly tested, so your results may vary.
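A rough sketch of what re-optimizing at a larger input size might look like with torch2trt (the workflow trt_pose's examples use). The 512x512 size, the `fp16_mode` choice, and the stride-16 rounding (which makes the published sizes 224, 256, 368, and 384 divide evenly) are assumptions here; check your backbone's actual stride, and note this path needs a CUDA-capable device:

```python
def optimized_input_shape(width, height, stride=16):
    """Round a requested size down to a multiple of the assumed
    network stride so the output feature map divides evenly."""
    return (1, 3, (height // stride) * stride, (width // stride) * stride)

def build_engine(model, width=512, height=512):
    # Imported lazily so the shape helper stays usable without a GPU.
    import torch
    from torch2trt import torch2trt
    # The dummy tensor's shape fixes the engine's input size.
    data = torch.zeros(optimized_input_shape(width, height)).cuda()
    return torch2trt(model, [data], fp16_mode=True)
```

Running the resulting engine at, say, 512x512 shifts the effective object-size range as described above, so it's worth validating detection quality on your own footage.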

Please let me know if this helps or you have further questions.

Best, John

jaybdub avatar Dec 09 '20 20:12 jaybdub

Hi John,

Thanks for your answer. Inference with a higher input size does indeed work as you explained. After running multiple tests, however, I have the feeling that the quality of the predictions is not as good as first resizing to either 224x224 or 256x256 (depending on the model used) and then inferring.

It also leads me to another question. During training, how did you resize the images? Did you squish them or crop them?

Best,

Lionel

lweicker avatar Dec 11 '20 10:12 lweicker