
TensorRT 8.5.2.2 GPU AGX Xavier Jetson 5.1 - Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv.)

Open HichTala opened this issue 2 years ago • 6 comments

Description

Hi everyone,

I am currently working on converting a YOLOv5 ONNX model to TensorRT on my AGX Xavier running JetPack 5.1. Unfortunately, I've encountered an issue that I'm struggling to resolve.

Here’s the error message I’m encountering:

[12/12/2023-09:55:41] [TRT] [E] 10: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv.)

Environment

I am using TensorRT version 8.5.2.2 with CUDA 11.4. If necessary, I can share the ONNX model.

Any insights or guidance on how to resolve this issue would be greatly appreciated. Thank you in advance for your time and assistance.

HichTala avatar Dec 12 '23 12:12 HichTala

Hi, if you are the same person as in this post, follow the method I suggested over there. Even if you are not, the method suggested on the forum should still apply.

https://forums.developer.nvidia.com/t/tensorrt-trying-to-convert-an-onnx-model-to-tensorrt/275818

RajUpadhyay avatar Dec 13 '23 00:12 RajUpadhyay

  1. Do other models work?
  2. Can you try increasing the workspace size?
  3. If that still doesn't fix it, please share the ONNX here.
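
For reference, item 2 can be tried with trtexec. A minimal sketch that just builds the command lines for a few workspace sizes (the ONNX path and the size ladder are placeholders; on TRT 8.4+ the `--memPoolSize=workspace:<N>MiB` flag replaces the deprecated `--workspace` flag):

```python
# Sketch: construct trtexec invocations with progressively larger workspace
# limits. Nothing is executed here; print and run the one you want.
import shlex

def trtexec_cmd(onnx_path, workspace_mib, fp16=True):
    """Build a trtexec command with an explicit workspace memory-pool limit."""
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--memPoolSize=workspace:{workspace_mib}MiB",
        "--saveEngine=model.engine",
    ]
    if fp16:
        cmd.append("--fp16")
    return cmd

for mib in (1024, 2048, 4096):
    print(shlex.join(trtexec_cmd("last.onnx", mib)))
```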

Thanks!

zerollzeng avatar Dec 17 '23 05:12 zerollzeng

Hi, sorry for the late reply,

@RajUpadhyay I am indeed the same person as in the other post. The method you shared is the same one I used in the first place, except that I am using an older script because my YOLOv5 model was trained on an older version (here is the old version of YOLO that I am using). The model was trained by a colleague before I arrived; he has since left, and I don't have the training data at the moment. The only thing I have is the trained model, attached to this message. Update: I tried a newer version of YOLOv5 with one of their pre-trained models and I still get the error. Maybe it's related to the GPU, which is an AGX Xavier and not a "traditional" one?

@zerollzeng Other models give the same error, though I haven't tried models without convolutions. I tried increasing the workspace, but it still gives the same error message...

I didn't mention it before, but I am trying to calibrate my model using this script. Here is the command I run:

python ./trt_quant/convert_tqt_quant.py --img-dir val2017/ --img-size 512 --batch-size 1 --batch 50 --onnx-model last.onnx
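
For context, with `--batch-size 1` and `--batch 50` a calibration run would feed 50 single-image batches from `val2017/`. The grouping logic looks roughly like this (stdlib-only sketch; the real script also decodes and resizes each image to 512x512, which is omitted here, and the file names below are placeholders):

```python
# Sketch of how --batch-size / --batch translate into calibration batches.
def calibration_batches(image_paths, batch_size, num_batches):
    """Yield at most num_batches groups of batch_size paths each."""
    batches = []
    for i in range(0, len(image_paths), batch_size):
        if len(batches) == num_batches:
            break
        chunk = image_paths[i:i + batch_size]
        if len(chunk) == batch_size:  # drop a ragged final batch
            batches.append(chunk)
    return batches

paths = [f"val2017/{i:012d}.jpg" for i in range(100)]
print(len(calibration_batches(paths, batch_size=1, num_batches=50)))  # 50
```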

Thank you for your help, I really appreciate it.

last.zip

HichTala avatar Dec 19 '23 15:12 HichTala

I feel like it may be an environment issue. Could you please try flashing the latest JP 6.0 and trying again? We won't fix bugs on TRT 8.5 now.

zerollzeng avatar Dec 27 '23 13:12 zerollzeng

I couldn't reproduce the issue with polygraphy:

nvidia@tegra-ubuntu:~/scratch.zeroz_sw/github_bug/3545$ polygraphy convert last.onnx --int8 -o out.plan
[I] Finished engine building in 434.928 seconds

zerollzeng avatar Dec 27 '23 13:12 zerollzeng


I tried to convert your ONNX file and was able to generate the TRT engine without any failure, although I did it on my x86 PC (Ubuntu 22.04, DeepStream 6.4) using the trtexec tool.

Unfortunately I won't be able to try it on my Jetson, which has JetPack 5.1.2 (TRT version 8.5.2), since it's the holidays, sorry. So I am unsure whether TensorRT 8.6 is what solves this error, but since you have an AGX Xavier, I do not think you can upgrade to JP 6.0 anyway.

Can you try one thing, though? Why don't you use a Docker image on your Jetson to check whether it is in fact an environment issue? You can go to the jetson-containers git repo by dusty-nv and run a Docker image for the DeepStream SDK, then run this command:

./trtexec --onnx=last.onnx --saveEngine=engine_fp16.engine --fp16 --useCudaGraph --verbose

Here is the link: https://github.com/dusty-nv/jetson-containers
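
The container route can be scripted as well. A dry-run sketch that only prints the commands rather than executing them (the image tag below is an assumption for JetPack 5.1, check the jetson-containers README or Docker Hub for the tags that match your L4T release):

```python
# Dry-run sketch: print the docker + trtexec commands instead of running them.
import shlex

image = "dustynv/deepstream:r35.2.1"  # hypothetical tag; match your JetPack/L4T

docker_cmd = [
    "docker", "run", "--runtime", "nvidia", "-it", "--rm",
    "-v", "$PWD:/workspace", "-w", "/workspace", image,
]
trt_cmd = [
    "trtexec", "--onnx=last.onnx", "--saveEngine=engine_fp16.engine",
    "--fp16", "--useCudaGraph", "--verbose",
]

print(shlex.join(docker_cmd))
print(shlex.join(trt_cmd))
```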

RajUpadhyay avatar Dec 27 '23 23:12 RajUpadhyay

Closing since there has been no activity for more than 3 weeks, per our policy. Thanks all!

ttyio avatar May 07 '24 18:05 ttyio