
Inference accuracy

frankvp11 opened this issue 3 years ago • 19 comments

Description

When I run infer.py on a sample image from the TensorRT samples, it gives me totally wrong predictions. I used the infer.py that azhurkevich suggested to me in the tensorflow_object_detection_api sample, with a model.onnx he provided (which I first ran through create_onnx.py in the detectron2 directory). I can supply the converted model. Checking it in Netron, it looked fine. However, when I opened the outputs from the inference, this is what I got.

image image.txt

Environment

TensorRT Version: 8.2.1
NVIDIA GPU: Jetson TX2
NVIDIA Driver Version: JetPack 4.6.2 (IDK)
CUDA Version: 10.2
CUDNN Version: 8.2.1
Operating System: Ubuntu 18.04
Python Version (if applicable): 3.6.9
Tensorflow Version (if applicable): NA
PyTorch Version (if applicable): 1.9.0
Baremetal or Container (if so, version): container

Relevant Files

https://drive.google.com/file/d/1RZIBnvBtND2uqQ3a3cxUcT7kJOf61mMK/view?usp=sharing -- the converted.onnx file I used (output of create_onnx.py)
dockerfile.txt -- the Dockerfile I used, based on the TensorRT repo

Steps To Reproduce

Everything ran in a Docker container, which I pasted above. Then I ran the commands I specified above. The result was the badly wrong detections shown in the attached image.
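For reference, the Python inference in the sample boils down to roughly the following (a minimal sketch using the TensorRT and pycuda Python bindings; the engine path, buffer names, and placeholder input are illustrative, not the actual infer.py code):

```python
# Rough sketch of engine deserialization and single-image inference with TensorRT 8.x.
# Assumes an already-built "engine.trt"; paths and shapes are placeholders.
import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")  # register built-in TRT plugins (NMS etc.)

with open("engine.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate pinned host buffers and device buffers for every binding.
bindings, host_bufs, dev_bufs = [], [], []
for i in range(engine.num_bindings):
    size = trt.volume(engine.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(size, dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

# Placeholder input; real code would load and preprocess the sample image here.
image = np.zeros(trt.volume(engine.get_binding_shape(0)),
                 dtype=trt.nptype(engine.get_binding_dtype(0)))
np.copyto(host_bufs[0], image)

stream = cuda.Stream()
cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for host, dev in zip(host_bufs[1:], dev_bufs[1:]):
    cuda.memcpy_dtoh_async(host, dev, stream)
stream.synchronize()
# host_bufs[1:] now hold the raw detection outputs (boxes, scores, classes, ...).
```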

frankvp11 avatar Aug 10 '22 15:08 frankvp11

@azhurkevich ^ ^

zerollzeng avatar Aug 11 '22 12:08 zerollzeng

It seems to me you are mixing the tfod and detectron2 samples, which is a bad idea. I cannot foresee what will happen if you mix the two samples together. Plus, I don't really understand what you are doing.

azhurkevich avatar Aug 11 '22 13:08 azhurkevich

What do you mean by mixing samples? The only thing I used tfod for was inference. That's because I am unable to get Detectron2 onto my Jetson TX2, since it comes with TensorRT preinstalled on Python 3.6. What I am trying to do is run inference on Detectron2 with the optimizations that TensorRT brings. @azhurkevich

frankvp11 avatar Aug 11 '22 13:08 frankvp11

Why don't you verify with the detectron2 sample on your PC and check the evaluation and inference results following the Detectron2 README? Inference in the sample is purely for checking purposes, not real-world inference. This Python inference will significantly slow you down.

The main reason you don't get proper results is that your TRT version, 8.2.1, is too old. The detectron2 README says that TRT must be >= 8.4.1; I also mentioned this to you in your initial issue.
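A quick way to confirm which TensorRT version the Python bindings actually see (JetPack 4.6.x ships 8.2.1, while the detectron2 sample expects >= 8.4.1):

```python
# Print the TensorRT version visible to Python.
import tensorrt as trt
print(trt.__version__)
```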

Getting back to proper inference, here is what you should do: convert the detectron2 model to TensorRT on your PC, evaluate it on your PC (using the detectron2 sample code and TRT 8.4.1), and check whether you are getting 40.2 mAP; you can also infer a couple of images after that if you want. After that you should figure out how to install TRT 8.4.1 on the Jetson device, replacing the default one in JetPack. This was mentioned originally. Once you have done that, you should look into the DeepStream SDK and build an efficient inference pipeline (using the ONNX converted on your PC and a TRT engine built from it on the Jetson). Please don't ask me how to use DeepStream; I have not done that yet, so I simply do not know. With DeepStream you will get the best inference performance possible, which is critical if you want to run on a Jetson. In addition, I recommend using the DeepStream C++ SDK to get the best performance; you will lose some performance to the Python interpreter if you use the Python SDK.

azhurkevich avatar Aug 11 '22 13:08 azhurkevich

OK, I'll give it a shot from there. Do I need to rebuild the converted.onnx after I've already done that? For what it's worth, I did that on Google Colab, if it makes a difference.

frankvp11 avatar Aug 11 '22 13:08 frankvp11

To make it easier, just take the model that I've shared with you and start from the create_onnx.py step. Please follow the instructions and requirements; this is very important.

azhurkevich avatar Aug 11 '22 14:08 azhurkevich

Ok, will do. On my other PC, do I need to have a GPU for this to work?

frankvp11 avatar Aug 11 '22 14:08 frankvp11

You will not be able to use TRT without a GPU.

azhurkevich avatar Aug 11 '22 14:08 azhurkevich

So it can't even work on my PC? This was what I was afraid of when you mentioned PC

frankvp11 avatar Aug 11 '22 14:08 frankvp11

Any other ideas @azhurkevich? Because I do not currently have access to a PC with a GPU. Or you, @zerollzeng?

Edit: Do you think it's possible to just run it in a Docker container on the Jetson and try it there? That is, would it be possible to run a Docker container containing TensorRT 8.4.1? I asked on the NVIDIA developer forums for advice on how to edit the files I flash with, and they said it's not possible, supported, or suggested, and that their advice would be to buy a new board (which won't be happening). What do you think of that idea @azhurkevich? Do you think it could work?

frankvp11 avatar Aug 11 '22 15:08 frankvp11

Because I take it it's not suggested or easy to rebuild/reinstall TensorRT on the Jetson device after having flashed it? @azhurkevich

Edit: Never mind, I've tried the Docker image approach and it didn't work.

frankvp11 avatar Aug 12 '22 13:08 frankvp11

It is possible if you install all the requirements on the Jetson, meaning not only TRT 8.4.1 but the Python libraries too. I suspect a lot of them will not play well with ARM out of the box, so you will most likely have to build some of them from source. That's why the Jetson should be used only for deployment in this case.

If you don't have a GPU on your PC, you can probably get some cheap cloud instances and do the work there.

azhurkevich avatar Aug 12 '22 14:08 azhurkevich

I don't know how to upgrade the TRT version on JetPack, so yeah, I cannot help much with this. It should be doable, though.

azhurkevich avatar Aug 12 '22 14:08 azhurkevich

Hmm, and there's no other way? You don't happen to have an engine.trt I can test with? I've been running into lots of problems and don't really have access to cloud instances. I've also been trying the advice suggested in https://github.com/NVIDIA/TensorRT/issues/2029, but I'm getting CMake issues, which I am working to resolve.

frankvp11 avatar Aug 12 '22 14:08 frankvp11

Which GA release should I get from https://developer.nvidia.com/nvidia-tensorrt-8x-download @azhurkevich? None of them really target ARM Ubuntu 18.04. I've tried to use the Docker containers provided, but they take up too much disk space, so I can't use them even with a base flash.

frankvp11 avatar Aug 12 '22 15:08 frankvp11

You have to build the TRT engine yourself because it is device specific and not portable. I am sure that if you have internet access you can get access to the cloud; it's up to you, though. I think you are overcomplicating things. You should use TRT 8.4.1; other than that I have nothing to say. I have not tested this sample on a Jetson device, so I will not point you to a specific file and say "use this". The rest of the issues you should figure out on your own, because they are not related to TRT specifically.
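To illustrate the "build the engine on the device it will run on" point, here is a minimal ONNX-to-engine sketch with the TensorRT Python API; the file names and the FP16 flag are assumptions, and trtexec achieves the same thing from the command line:

```python
# Minimal sketch: parse converted.onnx and serialize a TensorRT engine.
# The engine must be (re)built on the device it will run on (e.g. the Jetson).
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")  # the detectron2 sample relies on TRT plugins

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("converted.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parsing failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # optional; assumes the GPU supports FP16

serialized = builder.build_serialized_network(network, config)
with open("engine.trt", "wb") as f:
    f.write(serialized)
```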

azhurkevich avatar Aug 12 '22 16:08 azhurkevich

So basically it's not realistically possible for my use case? The TX2 doesn't have enough disk space for a reinstall, it's not realistically possible to reinstall TensorRT, and according to the devs on the developer forum it isn't possible to flash with a higher version?

frankvp11 avatar Aug 12 '22 16:08 frankvp11

Also, I know you don't like Google Colab, but is that a viable option as well? It has access to GPUs; I'm just wondering if you know how to get TensorRT on there?

frankvp11 avatar Aug 12 '22 16:08 frankvp11

Nothing is impossible; in your case it will just take more time to figure out. Your approach is not the best one in my opinion. I've already mentioned what you should ideally do. If you don't like it, do it your way, but be prepared to take responsibility for issues that you might face down the line. In this context, by responsibility I mean you figuring out issues on your own. The rest of the details and the whole process are up to you to go through.

I don't use Colab, so I have no idea. You can literally get cloud instances and do this work for free if you don't have a GPU; I will not point you to where you should go. In addition, you are asking too many questions that are not related to TensorRT; you should figure them out on your own.

azhurkevich avatar Aug 12 '22 16:08 azhurkevich

mask-rcnn detectron2 [PluginV2Ext]_output_0: tensor volume exceeds (2^31)-1

zhuimeng2080 avatar Dec 04 '22 11:12 zhuimeng2080

Where is the shared converted.onnx referred to in the tutorial readme.cmd?

zhuimeng2080 avatar Dec 04 '22 14:12 zhuimeng2080

I don't know if it will help, but I created a Medium article about this and how I was able to optimize Detectron2 with TensorRT for a Jetson TX2. I've linked it here: https://medium.com/@frankvanpaassen3/how-to-optimize-custom-detectron2-models-with-tensorrt-2cd710954ad3

frankvp11 avatar Dec 04 '22 15:12 frankvp11

@zhuimeng2080 I updated the converter in October; it works out of the box now, please follow the instructions. You are hitting a fundamental TRT tensor size limitation; most likely you used a very large batch size. Try it with batch size 1 and gradually increase it. The converted ONNX will be the result of this step.
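For a sense of the limit: each tensor's element count must fit in a signed 32-bit integer, so batch × channels × height × width has to stay below 2^31 - 1, and the batch size is the easiest factor to shrink. A quick check with illustrative shapes (not taken from the actual model):

```python
# A tensor's element count in TensorRT must stay below 2**31 - 1.
# The shapes below are illustrative only.
import numpy as np

INT32_MAX = 2**31 - 1

def fits_in_trt(shape):
    volume = int(np.prod(shape))
    return volume, volume <= INT32_MAX

print(fits_in_trt((1, 100, 800, 1333)))   # ~1.1e8 elements: fine
print(fits_in_trt((32, 100, 800, 1333)))  # ~3.4e9 elements: exceeds the limit
```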

azhurkevich avatar Dec 05 '22 17:12 azhurkevich

The converted.onnx that fails to convert to a TRT engine is linked below:

converted.onnx link: https://pan.baidu.com/s/1Le5k0MjM6-yxcCVBUeNrtQ, extraction code: yw2k (shared via Baidu Netdisk)

zhuimeng2080 avatar Dec 08 '22 07:12 zhuimeng2080

@zhuimeng2080 I cannot access anything there; can you upload it to Google Drive?

azhurkevich avatar Dec 08 '22 08:12 azhurkevich

closing inactive issues, thanks all!

ttyio avatar Dec 05 '23 21:12 ttyio