
Zero faces detected after trt conversion (Retinaface)

Open dubeamit opened this issue 3 years ago • 9 comments

Env

  • GPU: NVIDIA A10
  • OS: Ubuntu 20.04.4 LTS
  • CUDA Version: 11.6
  • Driver Version: 510.47.03
  • TensorRT version: 8.0.1.6
  • gcc version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

I am running this inside an NVIDIA container: docker pull nvcr.io/nvidia/tensorrt:21.07-py3

About this repo

Your problem

  1. built opencv from source (version 4.6.0)

  2. pytorch version: torch==1.12.0+cu116, torchaudio==0.12.0+cu116, torchvision==0.13.0+cu116

  3. Followed the instructions in the repo, cloned Pytorch_Retinaface.git, and generated retinaface.wts (the .wts format is sketched just after this list)

  4. here the test.jpg file was generated with all faces detected

  5. cloned https://github.com/wang-xinyu/tensorrtx.git repo

  6. after running cmake, make, and the serialize step, inference detects zero faces in worlds_largest_selfie (a sketch of the deserialize/inference step follows below)
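For reference (step 3), the .wts file is just a plain-text dump: the first line holds the number of weight blobs, and each following line holds a tensor name, its element count, and that many hex-encoded float32 words. A simplified reader in the spirit of the repo's loadWeights helper (a sketch, with error handling trimmed):

```cpp
#include <cstdint>
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Parse a tensorrtx-style .wts file: "<blob count>\n" followed by lines of
// "<name> <count> <hex word> ... <hex word>", where each hex word is the
// bit pattern of one float32.
std::map<std::string, std::vector<uint32_t>> loadWts(const std::string& path) {
    std::map<std::string, std::vector<uint32_t>> weights;
    std::ifstream input(path);
    int32_t count = 0;
    input >> count;
    while (count-- > 0) {
        std::string name;
        uint32_t size = 0;
        input >> name >> std::dec >> size;
        std::vector<uint32_t> vals(size);
        for (uint32_t i = 0; i < size; ++i)
            input >> std::hex >> vals[i];
        weights[name] = std::move(vals);
    }
    return weights;
}
```

If the blob count or the tensor names in the generated file don't match what retina_r50.cpp looks up, the problem is on the .wts generation side (i.e. the pytorch step), not in TensorRT.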

Expectation: faces should be detected after the trt conversion.

Also tried different trt and cuda versions, but still no detections after the trt conversion. Combinations tried:

  1. container: docker pull nvcr.io/nvidia/tensorrt:21.06-py3, TensorRT-7.2.3.4, torch==1.12.0+cu113, torchaudio==0.12.0+cu113, torchvision==0.13.0+cu113
  2. container: docker pull nvcr.io/nvidia/deepstream:6.1-triton, tensorrt==8.2.5.1, torch==1.12.0+cu116, torchaudio==0.12.0+cu116, torchvision==0.13.0+cu116
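For context, the serialize/deserialize round trip that step 6 exercises follows the standard nvinfer1 pattern; a minimal sketch of the deserialize side (gLogger stands in for an ILogger implementation, as in the tensorrtx samples):

```cpp
#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Load the engine file written by "./retina_r50 -s".
nvinfer1::ICudaEngine* loadEngine(const std::string& path,
                                  nvinfer1::ILogger& logger) {
    std::ifstream file(path, std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    // A null return here means the engine was built with a different
    // TensorRT version, or for a different GPU, than the one deserializing it.
    return runtime->deserializeCudaEngine(blob.data(), blob.size());
}
```

Note that a serialized engine is tied to the exact GPU and TensorRT version it was built on, so an engine file cannot be moved between the T4 and the A10.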

dubeamit avatar Jul 06 '22 07:07 dubeamit

Which pytorch trained model were you using?

wang-xinyu avatar Jul 06 '22 12:07 wang-xinyu

Which pytorch trained model were you using?

I used the pretrained RetinaFace model Resnet50_Final.pth mentioned in the repo here

dubeamit avatar Jul 06 '22 15:07 dubeamit

@dubeamit Maybe your pytorch version is too new; in that repo they were using 1.1. Can you try 1.1 or 1.3?

wang-xinyu avatar Jul 07 '22 09:07 wang-xinyu

@wang-xinyu so this worked with pytorch 1.4 and cuda 11.1 on a Tesla T4 GPU. But with pytorch 1.4 and cuda 11.3 on an A10 GPU it gets stuck while loading the model and I can't generate the weights.

So I used the same container on the Tesla T4 GPU: TensorRT container image version 20.09, which is based on TensorRT 7.1.3 and cuda 11.0.

After getting the retinaface.wts file and running ./retina_r50 -s, I get the following error:

root@nvmbdprp023198:~/tensorrtx/retinaface/build# ./retina_r50 -s
Loading weights: ../retinaface.wts
Building engine, please wait for a while...
[07/07/2022-11:32:15] [W] [TRT] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
[07/07/2022-11:32:23] [E] [TRT] ../rtSafe/cuda/caskUtils.cpp (98) - Assertion Error in trtSmToCask: 0 (Unsupported SM.)
Build engine successfully!
retina_r50: /root/tensorrtx/retinaface/retina_r50.cpp:251: void APIToModel(unsigned int, nvinfer1::IHostMemory**): Assertion `engine != nullptr' failed.
Aborted (core dumped)
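The Unsupported SM assertion is what TensorRT prints when it doesn't recognize the GPU's compute capability. The A10 is an Ampere card (SM 8.6), which TensorRT 7.1.3 (the CUDA 11.0-era build in the 20.09 container) predates, and the FP16 warning above is consistent with TRT misidentifying the card, so this looks like the engine build ran against the A10 rather than the T4. A quick way to confirm which SM version a container sees (standard CUDA runtime API; sm_check.cu is a hypothetical file name):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    // The TensorRT release in use must list this compute capability
    // among its supported SMs, or engine building will fail.
    std::printf("%s: SM %d.%d\n", prop.name, prop.major, prop.minor);
    return 0;
}
```

Compile with nvcc sm_check.cu -o sm_check and run it inside each container before building the engine.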

dubeamit avatar Jul 07 '22 11:07 dubeamit

Since it worked in one of your environments, it's not a code issue. You need to check the GPU driver, cuda, tensorrt installation, etc.

wang-xinyu avatar Jul 07 '22 12:07 wang-xinyu

What is the max version of cuda and tensorrt that this code can support? I've already tried docker containers with:

  1. TensorRT-8.2.5.1, cuda11.6
  2. TensorRT-7.2.3.4, cuda11.3
  3. TensorRT-8.0.1.6, cuda11.6
  4. TensorRT-7.1.3.4, cuda11.0

In the 4th one I get the Unsupported SM error. In the others I simply don't get any detections.

For the other combinations, changing the conf_thres to 0.5 gives detections all over the place; see the attachment 0_result.
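Detections all over the place at a low threshold usually mean the raw network outputs are noise (i.e. the weights didn't survive the conversion), since the post-processing is just a confidence filter followed by NMS. A hedged sketch of that stage (the structure mirrors typical retinaface post-processing; names like confThresh/nmsThresh are assumptions, not the repo's exact identifiers):

```cpp
#include <algorithm>
#include <vector>

struct Detection {
    float x1, y1, x2, y2;  // box corners in pixels
    float conf;            // face confidence score
};

// Intersection-over-union of two axis-aligned boxes.
static float iou(const Detection& a, const Detection& b) {
    float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    float inter = std::max(0.f, ix2 - ix1) * std::max(0.f, iy2 - iy1);
    float uni = (a.x2 - a.x1) * (a.y2 - a.y1)
              + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
    return uni > 0.f ? inter / uni : 0.f;
}

std::vector<Detection> postProcess(std::vector<Detection> dets,
                                   float confThresh, float nmsThresh) {
    // Drop low-confidence boxes; with garbage weights this stage yields
    // either zero survivors (high threshold) or boxes everywhere (low one).
    dets.erase(std::remove_if(dets.begin(), dets.end(),
                   [&](const Detection& d) { return d.conf < confThresh; }),
               dets.end());
    // Greedy NMS: keep the highest-scoring box, suppress heavy overlaps.
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.conf > b.conf; });
    std::vector<Detection> kept;
    for (const auto& d : dets) {
        bool suppressed = false;
        for (const auto& k : kept)
            if (iou(d, k) > nmsThresh) { suppressed = true; break; }
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}
```

With correct weights, raising the confidence threshold thins the detections out gradually; flipping between zero faces and faces everywhere points at the weights, not the threshold.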

Also, the host machine (A10) has the latest cuda and nvidia drivers, as mentioned in the first post.

dubeamit avatar Jul 08 '22 08:07 dubeamit

use https://github.com/wang-xinyu/tensorrtx/tree/master/docker/trt8 to build the docker, which may solve the problem

snlpatel001213 avatar Jul 15 '22 07:07 snlpatel001213

@snlpatel001213

use https://github.com/wang-xinyu/tensorrtx/tree/master/docker/trt8 to build the docker, which may solve the problem

I tried the above docker but am still not getting any detections. The cuda version inside the container is 11.2; on the host machine it is 11.6.

While converting to trt I get the following warnings:

[07/18/2022-17:01:57] [W] [TRT] TensorRT was linked against cuBLAS/cuBLASLt 11.6.5 but loaded cuBLAS/cuBLASLt 11.4.1
[07/18/2022-17:02:00] [W] [TRT] TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.1.1
[07/18/2022-17:02:00] [W] [TRT] TensorRT was linked against cuBLAS/cuBLASLt 11.6.5 but loaded cuBLAS/cuBLASLt 11.4.1
[07/18/2022-17:02:00] [W] [TRT] TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.1.1

Pytorch version:

torch==1.12.0+cu113
torchaudio==0.12.0+cu113
torchvision==0.13.0+cu113

dubeamit avatar Jul 18 '22 10:07 dubeamit

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 16 '22 10:09 stale[bot]