
Error when converting the ONNX model to a TensorRT Engine (sh build_trt.sh)

Open • PonyMaY opened this issue 2 years ago • 1 comment

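Hello, I ran into a problem while reproducing this project, specifically in the ONNX-to-TensorRT conversion (Step 3 of "Building the TensorRT Engine": converting the ONNX model to a TensorRT Engine).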

My environment is as follows:

  • NGC tensorrt:21.02-py3 Docker container

  • TensorRT 7.2.2

  • Python 3.8.5

  • PyTorch 1.8.1+cu11.1

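All of these versions match the requirements in your README instructions. I then cloned the project, installed the third-party libraries, downloaded the original weights, and followed the usage steps in the README:

  • 1. Build the plugin. The build output is shown below (no errors), and the shared libraries DCNv2Plugin.so and DCNv2PluginDyn.so were produced in the build directory.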
cd build; make
make[1]: Entering directory '/home/trt-fairmot/TensorRT_ONNX_impl/build'
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -MD -MP -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DCNv2Plugin.o -c ../plugins/DCNv2Plugin.cpp
/usr/local/cuda/bin/nvcc -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -M -MT obj/DeformConv.o -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DeformConv.d ../plugins/DeformConv.cu
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -MD -MP -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DCNv2PluginDyn.o -c ../plugins/DCNv2PluginDyn.cpp
/usr/local/cuda/bin/nvcc -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -Xcompiler -fPIC -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o obj/DeformConv.o -c ../plugins/DeformConv.cu
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -shared -o DCNv2Plugin.so obj/DCNv2Plugin.o obj/DeformConv.o -L/usr/local/cuda/lib64 -L/usr/local/lib/python3.8/dist-packages/torch/lib/ -L/usr/local/tensorrt7.2-cuda11.1/lib -Wl,-rpath=/usr/local/cuda/lib64 -lcudart -lnvinfer -lnvonnxparser -ldl -lpthread -lcuda -ltorch -lc10 -ltorch_cuda -lc10_cuda -ltorch_cpu -ltorch_python
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -shared -o DCNv2PluginDyn.so obj/DCNv2PluginDyn.o obj/DeformConv.o -L/usr/local/cuda/lib64 -L/usr/local/lib/python3.8/dist-packages/torch/lib/ -L/usr/local/tensorrt7.2-cuda11.1/lib -Wl,-rpath=/usr/local/cuda/lib64 -lcudart -lnvinfer -lnvonnxparser -ldl -lpthread -lcuda -ltorch -lc10 -ltorch_cuda -lc10_cuda -ltorch_cpu -ltorch_python
make[1]: Leaving directory '/home/trt-fairmot/TensorRT_ONNX_impl/build'
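  • 2. Export the PyTorch model to ONNX. fairmot.onnx and fairmot_plugin.onnx were generated successfully, but while python build_onnx_engine.py was running, a message reported that DCNv2 is an undefined operator (which I believe is expected).

  • 3. Convert the ONNX model to a TensorRT Engine:

This is where it fails. Running sh build_trt.sh produces the following first error:

[E] Could not load plugin library: ./build/DCNv2PluginDyn.so, due to: libc10.so: cannot open shared object file: No such file or directory

Searching online suggests the cause may be that libc10.so is missing from torch/lib, but I have confirmed that this file does exist. As a result, trtexec fails and the DCN layer cannot be converted. The subsequent errors are: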

[04/12/2022-15:43:01] [I] [TRT] No importer registered for op: DCNv2Plugin. Attempting to import as plugin.
[04/12/2022-15:43:01] [I] [TRT] Searching for plugin: DCNv2Plugin, plugin_version: 1, plugin_namespace: 
[04/12/2022-15:43:01] [E] [TRT] INVALID_ARGUMENT: getPluginCreator could not find plugin DCNv2Plugin version 1
[04/12/2022-15:43:01] [E] [TRT] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/parsers/onnx/ModelImporter.cpp:705: While parsing node number 97 [DCNv2Plugin -> "534"]:
[04/12/2022-15:43:01] [E] [TRT] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/parsers/onnx/ModelImporter.cpp:706: --- Begin node ---
[04/12/2022-15:43:01] [E] [TRT] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/parsers/onnx/ModelImporter.cpp:707: input: "529"
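
One detail in the build output from step 1 may be relevant here: the link commands pass the torch/lib directory with -L, which only applies at link time, while the only rpath embedded in the .so files is /usr/local/cuda/lib64. If that is the cause, the dynamic loader would not search torch/lib at run time even though libc10.so exists there. The sketch below is only a way to check this assumption, not a confirmed fix; the helper name link_dirs_without_rpath is made up for illustration, and LINK_COMMAND is a shortened copy of the DCNv2PluginDyn.so link line above.

```python
import shlex
from typing import List


def link_dirs_without_rpath(link_command: str) -> List[str]:
    """Return the -L directories that are not also embedded via -Wl,-rpath=.

    Only the `-Wl,-rpath=DIR[:DIR...]` spelling used in the build output
    above is handled; other spellings of the rpath flag are ignored.
    """
    lib_dirs: List[str] = []
    rpath_dirs: List[str] = []
    prefix = "-Wl,-rpath="
    for token in shlex.split(link_command):
        if token.startswith("-L") and len(token) > 2:
            # Link-time search directory (capital L, unlike -lfoo).
            lib_dirs.append(token[2:].rstrip("/"))
        elif token.startswith(prefix):
            # Run-time search directories stored inside the .so.
            rpath_dirs.extend(p.rstrip("/") for p in token[len(prefix):].split(":"))
    return [d for d in lib_dirs if d not in rpath_dirs]


# Shortened from the DCNv2PluginDyn.so link line in step 1 (assumption:
# the omitted flags do not affect the library search path).
LINK_COMMAND = (
    "g++ -fPIC -shared -o DCNv2PluginDyn.so obj/DCNv2PluginDyn.o obj/DeformConv.o "
    "-L/usr/local/cuda/lib64 "
    "-L/usr/local/lib/python3.8/dist-packages/torch/lib/ "
    "-L/usr/local/tensorrt7.2-cuda11.1/lib "
    "-Wl,-rpath=/usr/local/cuda/lib64 "
    "-lcudart -lnvinfer -ltorch -lc10"
)

# Directories the loader must find some other way at run time
# (for example through LD_LIBRARY_PATH or the ldconfig cache).
for directory in link_dirs_without_rpath(LINK_COMMAND):
    print(directory)
```

If torch/lib shows up in this list, running ldd ./build/DCNv2PluginDyn.so inside the container should show whether libc10.so is reported as "not found". In that case, one thing to try (untested here) is adding /usr/local/lib/python3.8/dist-packages/torch/lib to LD_LIBRARY_PATH before running sh build_trt.sh.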

That is the full reproduction of the error. I would greatly appreciate an answer or any thoughts you may have. Thank you very much!

PonyMaY • Apr 12 '22 16:04

I ran into the same problem. Have you solved it? Also, could we connect on QQ to discuss it? QQ: 3335827554

hujxjjsjajs • May 08 '22 13:05