
YOLOv9 build error

reaganch opened this issue 1 year ago • 13 comments

Env

  • GPU: GTX 1070

  • OS: Linux Mint 21.3

  • CUDA version: 12.3

  • TensorRT version: 10.0

About this repo

  • Branch/tag/commit: latest

  • Model: yolov9

Your problem

  • Command: make

  • Output: I get the following error:

% make
[ 20%] Built target myplugins
[ 30%] Building CXX object CMakeFiles/yolov9.dir/demo.cpp.o
In file included from /usr/local/include/opencv4/opencv2/core/vsx_utils.hpp:11,
                 from /usr/local/include/opencv4/opencv2/core/base.hpp:661,
                 from /usr/local/include/opencv4/opencv2/core.hpp:53,
                 from /usr/local/include/opencv4/opencv2/opencv.hpp:52,
                 from /home/cricket/build/tensorrtx/yolov9/include/postprocess.h:4,
                 from /home/cricket/build/tensorrtx/yolov9/demo.cpp:7:
/home/cricket/build/tensorrtx/yolov9/demo.cpp: In function ‘void prepare_buffer(nvinfer1::ICudaEngine*, float**, float**, float**)’:
/home/cricket/build/tensorrtx/yolov9/demo.cpp:72:20: error: ‘class nvinfer1::ICudaEngine’ has no member named ‘getNbBindings’
   72 |     assert(engine->getNbBindings() == 2);
      |                    ^~~~~~~~~~~~~
/home/cricket/build/tensorrtx/yolov9/demo.cpp:75:36: error: ‘class nvinfer1::ICudaEngine’ has no member named ‘getBindingIndex’
   75 |     const int inputIndex = engine->getBindingIndex(kInputTensorName);
      |                                    ^~~~~~~~~~~~~~~
/home/cricket/build/tensorrtx/yolov9/demo.cpp:76:37: error: ‘class nvinfer1::ICudaEngine’ has no member named ‘getBindingIndex’
   76 |     const int outputIndex = engine->getBindingIndex(kOutputTensorName);
      |                                     ^~~~~~~~~~~~~~~
/home/cricket/build/tensorrtx/yolov9/demo.cpp: In function ‘void infer(nvinfer1::IExecutionContext&, CUstream_st*&, void**, float*, int)’:
/home/cricket/build/tensorrtx/yolov9/demo.cpp:88:13: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘enqueue’; did you mean ‘enqueueV3’?
   88 |     context.enqueue(batchSize, buffers, stream, nullptr);
      |             ^~~~~~~
      |             enqueueV3
make[2]: *** [CMakeFiles/yolov9.dir/build.make:76: CMakeFiles/yolov9.dir/demo.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:111: CMakeFiles/yolov9.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
  • Expected output: the build completes without errors.

reaganch avatar Apr 06 '24 23:04 reaganch
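For context, these compile errors come from TensorRT 10 removing the implicit-batch bindings API (getNbBindings, getBindingIndex, enqueue) in favor of name-based I/O. If one wanted to stay on TensorRT 10 instead of downgrading, the removed calls would map roughly onto the snippet below. This is an untested sketch, not the repo's code; kInputTensorName and kOutputTensorName are the tensor-name constants from the repo's config.h, and the buffer pointers are assumed to be device memory.

```cpp
#include <cassert>
#include <NvInfer.h>
#include <cuda_runtime_api.h>

void prepare_and_infer(nvinfer1::ICudaEngine* engine,
                       nvinfer1::IExecutionContext* context,
                       void* input_dev, void* output_dev, cudaStream_t stream) {
    // getNbBindings() -> getNbIOTensors(): number of named I/O tensors.
    assert(engine->getNbIOTensors() == 2);
    // getBindingIndex() + buffers[] -> addresses are now set per tensor name.
    context->setTensorAddress(kInputTensorName, input_dev);
    context->setTensorAddress(kOutputTensorName, output_dev);
    // enqueue(batchSize, buffers, stream, nullptr) -> enqueueV3(stream),
    // which uses the tensor addresses registered above.
    context->enqueueV3(stream);
    cudaStreamSynchronize(stream);
}
```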

I presume this may be because TensorRTX needs an older version of TensorRT installed? Would be great if you could confirm. Thanks!

reaganch avatar Apr 07 '24 02:04 reaganch

Yes, try to use TensorRT <= 8.5

wang-xinyu avatar Apr 07 '24 07:04 wang-xinyu

Thanks for that. Will give it a shot. Cheers!

reaganch avatar Apr 07 '24 08:04 reaganch

Just installed TensorRT version 8.5 GA Update 2. This required installing CUDA version 11.8.0 and cuDNN version 8.9.7. I had previously installed CUDA version 12.3.2 to build OpenCV, so I currently have two installations of CUDA on my system. When I try to build TensorRTX for yolov9, I now get the following error when I run "cmake ..". Could you please advise what I may be doing wrong here and why it seems to be requiring CUDA 12.3?

% cmake ..
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- operation system is Linux-6.5.0-26-generic
-- current platform: Linux 
-- The CUDA compiler identification is NVIDIA 11.8.89
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
CMake Warning (dev) at /usr/local/lib/cmake/opencv4/OpenCVConfig.cmake:86 (find_package):
  Policy CMP0146 is not set: The FindCUDA module is removed.  Run "cmake
  --help-policy CMP0146" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

Call Stack (most recent call first):
  /usr/local/lib/cmake/opencv4/OpenCVConfig.cmake:108 (find_host_package)
  CMakeLists.txt:41 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Error at /home/cricket/.local/lib/python3.10/site-packages/cmake/data/share/cmake-3.29/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find CUDA: Found unsuitable version "11.8", but required is exact
  version "12.3" (found /usr/local/cuda-11.8)
Call Stack (most recent call first):
  /home/cricket/.local/lib/python3.10/site-packages/cmake/data/share/cmake-3.29/Modules/FindPackageHandleStandardArgs.cmake:598 (_FPHSA_FAILURE_MESSAGE)
  /home/cricket/.local/lib/python3.10/site-packages/cmake/data/share/cmake-3.29/Modules/FindCUDA.cmake:1291 (find_package_handle_standard_args)
  /usr/local/lib/cmake/opencv4/OpenCVConfig.cmake:86 (find_package)
  /usr/local/lib/cmake/opencv4/OpenCVConfig.cmake:108 (find_host_package)
  CMakeLists.txt:41 (find_package)


-- Configuring incomplete, errors occurred!

Thanks!

reaganch avatar Apr 07 '24 09:04 reaganch

Seems your opencv is linking cuda12.3, you can try to use docker

wang-xinyu avatar Apr 07 '24 11:04 wang-xinyu
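The mismatch arises because OpenCV's generated OpenCVConfig.cmake pins the exact CUDA version it was compiled against (12.3 here), while the CUDA toolkit CMake now finds is 11.8. Besides docker, one possible (untested) workaround is to rebuild OpenCV against the CUDA 11.8 toolkit so its config no longer demands 12.3 — a sketch, with example paths only:

```shell
# Hypothetical rebuild of OpenCV against CUDA 11.8 (paths are examples).
cd ~/build/opencv/build
cmake -D WITH_CUDA=ON \
      -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.8 \
      -D CMAKE_INSTALL_PREFIX=/usr/local ..
make -j"$(nproc)" && sudo make install
```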

Ah, I see. Thanks for the suggestion. Will give that a shot.

reaganch avatar Apr 07 '24 11:04 reaganch

root@b37d8b2aacd1:/workspace/tensorrtx/yolov9/build# sudo ./yolov9 -s ../yolov9-c.wts yolov9-c.engine c
[04/22/2024-13:15:57] [W] [TRT] The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
Loading weights: ../yolov9-c.wts
Your platform support int8: true
Building engine, please wait for a while...
reading calib cache: int8calib.table
[04/22/2024-13:16:02] [W] [TRT] TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.5.0
[04/22/2024-13:16:03] [W] [TRT] TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.5.0
[04/22/2024-13:16:03] [W] [TRT] TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.5.0
[04/22/2024-13:16:03] [E] [TRT] 1: Unexpected exception _Map_base::at
[04/22/2024-13:16:03] [E] [TRT] 2: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
Build engine successfully!
yolov9: /workspace/tensorrtx/yolov9/demo.cpp:31: void serialize_engine(unsigned int, std::string&, std::string&, std::string&): Assertion `serialized_engine != nullptr' failed.
Aborted

zmtttt avatar Apr 22 '24 05:04 zmtttt

@zmtttt have you checked this path https://github.com/wang-xinyu/tensorrtx/blob/d4aa52db68c36d10cfcb2fd9a818faf2d82bfd00/yolov9/include/config.h#L13

wang-xinyu avatar Apr 22 '24 05:04 wang-xinyu

@WuxinrongY Can we make yolov9 to use fp16 by default?

wang-xinyu avatar Apr 22 '24 05:04 wang-xinyu

@WuxinrongY Can we make yolov9 to use fp16 by default?

WuxinrongY avatar Apr 22 '24 08:04 WuxinrongY

@wang-xinyu, thanks, but with const static char* gCalibTablePath = "/home/zhaomt/com/tensorrtx/yolov9/calib/coco_calib", I still get the same error.

zmtttt avatar Apr 23 '24 04:04 zmtttt

@wang-xinyu, thanks, but with const static char* gCalibTablePath = "/home/zhaomt/com/tensorrtx/yolov9/calib/coco_calib", I still get the same error.

You need to append a "/" to that path, e.g. const static char* gCalibTablePath = "/home/zhaomt/com/tensorrtx/yolov9/calib/coco_calib/"

WuxinrongY avatar Apr 23 '24 06:04 WuxinrongY

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 06 '24 01:07 stale[bot]