
Zero faces detected after trt conversion (Retinaface)

Open dubeamit opened this issue 3 years ago • 9 comments

Env

  • GPU: NVIDIA A10
  • OS: Ubuntu 20.04.4 LTS
  • CUDA Version: 11.6
  • Driver Version: 510.47.03
  • TensorRT version: 8.0.1.6
  • gcc version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

I am running this inside an NVIDIA container: docker pull nvcr.io/nvidia/tensorrt:21.07-py3

About this repo

Your problem

  1. built opencv from source (version 4.6.0)

  2. pytorch version: torch==1.12.0+cu116, torchaudio==0.12.0+cu116, torchvision==0.13.0+cu116

  3. Followed the instructions in the repo, cloned Pytorch_Retinaface.git, and generated retinaface.wts (the .wts format is sketched just after this list)

  4. here the test.jpg file was generated with all faces detected

  5. cloned https://github.com/wang-xinyu/tensorrtx.git repo

  6. after running cmake, make, and the serialize step, inference detects zero faces in worlds_largest_selfie (a sketch of the deserialize/inference step follows below)
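For reference (step 3), the .wts file is just a plain-text dump: the first line holds the number of weight blobs, and each following line holds a tensor name, its element count, and that many hex-encoded float32 words. A simplified reader in the spirit of the repo's loadWeights helper (a sketch, with error handling trimmed):

```cpp
#include <cstdint>
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Parse a tensorrtx-style .wts file: "<blob count>\n" followed by lines of
// "<name> <count> <hex word> ... <hex word>", where each hex word is the
// bit pattern of one float32.
std::map<std::string, std::vector<uint32_t>> loadWts(const std::string& path) {
    std::map<std::string, std::vector<uint32_t>> weights;
    std::ifstream input(path);
    int32_t count = 0;
    input >> count;
    while (count-- > 0) {
        std::string name;
        uint32_t size = 0;
        input >> name >> std::dec >> size;
        std::vector<uint32_t> vals(size);
        for (uint32_t i = 0; i < size; ++i)
            input >> std::hex >> vals[i];
        weights[name] = std::move(vals);
    }
    return weights;
}
```

If the blob count or the tensor names in the generated file don't match what retina_r50.cpp looks up, the problem is on the .wts generation side (i.e. the pytorch step), not in TensorRT.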

Expectation: faces should be detected after the trt conversion.

Also tried different trt and cuda versions, but still no detections after the trt conversion. Combinations tried:

  1. container: docker pull nvcr.io/nvidia/tensorrt:21.06-py3, TensorRT-7.2.3.4, torch==1.12.0+cu113, torchaudio==0.12.0+cu113, torchvision==0.13.0+cu113
  2. container: docker pull nvcr.io/nvidia/deepstream:6.1-triton, tensorrt==8.2.5.1, torch==1.12.0+cu116, torchaudio==0.12.0+cu116, torchvision==0.13.0+cu116
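For context, the serialize/deserialize round trip that step 6 exercises follows the standard nvinfer1 pattern; a minimal sketch of the deserialize side (gLogger stands in for an ILogger implementation, as in the tensorrtx samples):

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Load the engine file written by "./retina_r50 -s".
nvinfer1::ICudaEngine* loadEngine(const std::string& path,
                                  nvinfer1::ILogger& logger) {
    std::ifstream file(path, std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    // A null return here means the engine was built with a different
    // TensorRT version, or for a different GPU, than the one deserializing it.
    return runtime->deserializeCudaEngine(blob.data(), blob.size());
}
```

Note that a serialized engine is tied to the exact GPU and TensorRT version it was built on, so an engine file cannot be moved between the T4 and the A10.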

dubeamit avatar Jul 06 '22 07:07 dubeamit

Which pytorch trained model were you using?

wang-xinyu avatar Jul 06 '22 12:07 wang-xinyu

Which pytorch trained model were you using?

I used the pretrained RetinaFace model Resnet50_Final.pth mentioned in the repo here

dubeamit avatar Jul 06 '22 15:07 dubeamit

@dubeamit Maybe your pytorch version is too new; in that repo they were using 1.1. Can you try 1.1 or 1.3?

wang-xinyu avatar Jul 07 '22 09:07 wang-xinyu

@wang-xinyu so this worked with pytorch 1.4 and cuda 11.1 on a Tesla T4 GPU. But with pytorch 1.4 and cuda 11.3 on an A10 GPU it gets stuck while loading the model and I can't generate the weights.

So I used the same container on the Tesla T4 GPU: TensorRT container image version 20.09, which is based on TensorRT 7.1.3 and cuda 11.0.

After getting the retinaface.wts file and running ./retina_r50 -s, I get the following error:

root@nvmbdprp023198:~/tensorrtx/retinaface/build# ./retina_r50 -s
Loading weights: ../retinaface.wts
Building engine, please wait for a while...
[07/07/2022-11:32:15] [W] [TRT] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
[07/07/2022-11:32:23] [E] [TRT] ../rtSafe/cuda/caskUtils.cpp (98) - Assertion Error in trtSmToCask: 0 (Unsupported SM.)
Build engine successfully!
retina_r50: /root/tensorrtx/retinaface/retina_r50.cpp:251: void APIToModel(unsigned int, nvinfer1::IHostMemory**): Assertion `engine != nullptr' failed.
Aborted (core dumped)
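The Unsupported SM assertion is what TensorRT prints when it doesn't recognize the GPU's compute capability. The A10 is an Ampere card (SM 8.6), which TensorRT 7.1.3 (the CUDA 11.0-era build in the 20.09 container) predates, and the FP16 warning above is consistent with TRT misidentifying the card, so this looks like the engine build ran against the A10 rather than the T4. A quick way to confirm which SM version a container sees (standard CUDA runtime API; sm_check.cu is a hypothetical file name):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    // The TensorRT release in use must list this compute capability
    // among its supported SMs, or engine building will fail.
    std::printf("%s: SM %d.%d\n", prop.name, prop.major, prop.minor);
    return 0;
}
```

Compile with nvcc sm_check.cu -o sm_check and run it inside each container before building the engine.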

dubeamit avatar Jul 07 '22 11:07 dubeamit

Since it worked in one of your environments, it's not a code issue. You need to check the GPU driver, cuda, tensorrt installation, etc.

wang-xinyu avatar Jul 07 '22 12:07 wang-xinyu

What is the max version of cuda and tensorrt that this code can support? I've already tried docker containers with:

  1. TensorRT-8.2.5.1, cuda11.6
  2. TensorRT-7.2.3.4, cuda11.3
  3. TensorRT-8.0.1.6, cuda11.6
  4. TensorRT-7.1.3.4, cuda11.0

In the 4th one I get the Unsupported SM error. In the others I simply don't get any detections.

For the other combinations, changing the conf_thres to 0.5 gives detections all over the place; see the attachment 0_result.
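Detections all over the place at a low threshold usually mean the raw network outputs are noise (i.e. the weights didn't survive the conversion), since the post-processing is just a confidence filter followed by NMS. A hedged sketch of that stage (the structure mirrors typical retinaface post-processing; names like confThresh/nmsThresh are assumptions, not the repo's exact identifiers):

```cpp
#include <algorithm>
#include <vector>

struct Detection {
    float x1, y1, x2, y2;  // box corners in pixels
    float conf;            // face confidence score
};

// Intersection-over-union of two axis-aligned boxes.
static float iou(const Detection& a, const Detection& b) {
    float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    float inter = std::max(0.f, ix2 - ix1) * std::max(0.f, iy2 - iy1);
    float uni = (a.x2 - a.x1) * (a.y2 - a.y1)
              + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
    return uni > 0.f ? inter / uni : 0.f;
}

std::vector<Detection> postProcess(std::vector<Detection> dets,
                                   float confThresh, float nmsThresh) {
    // Drop low-confidence boxes; with garbage weights this stage yields
    // either zero survivors (high threshold) or boxes everywhere (low one).
    dets.erase(std::remove_if(dets.begin(), dets.end(),
                   [&](const Detection& d) { return d.conf < confThresh; }),
               dets.end());
    // Greedy NMS: keep the highest-scoring box, suppress heavy overlaps.
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.conf > b.conf; });
    std::vector<Detection> kept;
    for (const auto& d : dets) {
        bool suppressed = false;
        for (const auto& k : kept)
            if (iou(d, k) > nmsThresh) { suppressed = true; break; }
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}
```

With correct weights, raising the confidence threshold thins the detections out gradually; flipping between zero faces and faces everywhere points at the weights, not the threshold.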

Also, the host machine (A10) has the latest cuda and nvidia drivers, as mentioned in the first post.

dubeamit avatar Jul 08 '22 08:07 dubeamit

use https://github.com/wang-xinyu/tensorrtx/tree/master/docker/trt8 to build the docker, which may solve the problem

snlpatel001213 avatar Jul 15 '22 07:07 snlpatel001213

@snlpatel001213

use https://github.com/wang-xinyu/tensorrtx/tree/master/docker/trt8 to build the docker, which may solve the problem

I tried the above docker but am still not getting any detections. The cuda version inside the container is 11.2; on the host machine it is 11.6.

While converting to trt I get the following warnings:

[07/18/2022-17:01:57] [W] [TRT] TensorRT was linked against cuBLAS/cuBLASLt 11.6.5 but loaded cuBLAS/cuBLASLt 11.4.1
[07/18/2022-17:02:00] [W] [TRT] TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.1.1
[07/18/2022-17:02:00] [W] [TRT] TensorRT was linked against cuBLAS/cuBLASLt 11.6.5 but loaded cuBLAS/cuBLASLt 11.4.1
[07/18/2022-17:02:00] [W] [TRT] TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.1.1

Pytorch version:

torch==1.12.0+cu113
torchaudio==0.12.0+cu113
torchvision==0.13.0+cu113

dubeamit avatar Jul 18 '22 10:07 dubeamit

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 16 '22 10:09 stale[bot]