Using custom trained model with ROS Deep Learning
Hi @dusty-nv ,
Thank you for this amazing project. I am trying to use a re-trained YOLOv8 model (trained with the Ultralytics library on the VisDrone dataset), exported to ONNX format, with your ROS Deep Learning framework. I have placed the model at "../jetson-inference/python/training/detection/ssd/models/YOLOV8/best.onnx" and I am trying to run inference with the following command:

roslaunch ros_deep_learning detectnet.ros1.launch model_path:="../jetson-inference/python/training/detection/ssd/models/YOLOV8/best.onnx" input:=file://home/jrvis/Downloads/IMG_9316.mp4 output:=file://home/jrvis/Downloads/output1.mp4
However, I am getting this error:
[TRT] loading network plan from engine cache... ../jetson-inference/python/training/detection/ssd/models/YOLOV8/best.onnx.1.1.8201.GPU.FP16.engine
[TRT] device GPU, loaded ../jetson-inference/python/training/detection/ssd/models/YOLOV8/best.onnx
[TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 274, GPU 3452 (MiB)
[TRT] Loaded engine size: 23 MiB
[TRT] Using cublas as a tactic source
[TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +158, GPU +89, now: CPU 438, GPU 3448 (MiB)
[TRT] Using cuDNN as a tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +240, GPU -5, now: CPU 678, GPU 3443 (MiB)
[TRT] Deserialization required 5478886 microseconds.
[TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +22, now: CPU 0, GPU 22 (MiB)
[TRT] Using cublas as a tactic source
[TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 678, GPU 3447 (MiB)
[TRT] Using cuDNN as a tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 678, GPU 3447 (MiB)
[TRT] Total per-runner device persistent memory is 22895104
[TRT] Total per-runner host persistent memory is 117824
[TRT] Allocated activation device memory of size 52026880
[TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +71, now: CPU 0, GPU 93 (MiB)
[TRT]
[TRT] CUDA engine context initialized on device GPU:
[TRT] -- layers 184
[TRT] -- maxBatchSize 1
[TRT] -- deviceMemory 52026880
[TRT] -- bindings 2
[TRT] binding 0
-- index 0
-- name 'images'
-- type FP32
-- in/out INPUT
-- # dims 4
-- dim #0 1
-- dim #1 3
-- dim #2 1504
-- dim #3 1504
[TRT] binding 1
-- index 1
-- name 'output0'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 1
-- dim #1 14
-- dim #2 46389
[TRT]
[TRT] 3: Cannot find binding of given name:
[TRT] failed to find requested input layer in network
[TRT] device GPU, failed to create resources for CUDA engine
[TRT] failed to create TensorRT engine for ../jetson-inference/python/training/detection/ssd/models/YOLOV8/best.onnx, device GPU
[TRT] detectNet -- failed to initialize.
[ERROR] [1699466176.794739503]: failed to load detectNet model
Is what I am trying to achieve possible with this project? If so, what am I doing wrong?
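
For reference, the binding names can also be read straight out of the exported ONNX file with a small script like this (a minimal sketch, assuming the onnx Python package is installed and using a shortened path to best.onnx):

import onnx

# Load the exported YOLOv8 model and list the graph's declared inputs/outputs.
# For this export they should match the TensorRT bindings shown in the log above
# ('images' for the input, 'output0' for the output).
model = onnx.load("best.onnx")
print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])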