Smaller pruned yolov8s model is not faster than the original yolov8s with TensorRT on Jetson Nano

Open minhhotboy9x opened this issue 9 months ago • 7 comments

Description

I have two models: the original yolov8s and a smaller, pruned yolov8s. I pruned the channels of the second model using the structural pruning method from Torch-Pruning. After pruning with a pruning rate of 0.2, I converted both the original and the pruned model to ONNX, then converted the ONNX models to FP16 engines on a Jetson Nano using Python. When I test the FPS, the pruned model is not faster than the original (both run at about 7.4 FPS). I also tried a pruning rate of 0.4; the pruned model's FPS increased to 8.5, but that gain is too small for such a pruning rate. Here are the layer profiles of the two models: yolov8s.txt yolov8s_0,2_pruning.txt
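For reference, the FPS comparison described above could be reproduced with something like the sketch below. This is not the script used in this issue. `measure_fps` and `speedup` are hypothetical helper names, and `infer` is assumed to be any zero-argument callable that runs one inference and blocks until the result is ready (GPU work is asynchronous, so the callable has to synchronize before returning or the timing will be misleading).

```python
import time
from typing import Callable


def measure_fps(infer: Callable[[], object], warmup: int = 10, iters: int = 100) -> float:
    """Average frames per second of `infer` over `iters` timed calls.

    Assumption: `infer` runs one full inference and returns only after
    the output is available (i.e. it synchronizes the GPU stream itself).
    """
    if iters <= 0:
        raise ValueError("iters must be positive")
    # Warm-up calls are not timed; they absorb one-off costs such as
    # lazy initialization and clock ramp-up.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    # Guard against a zero interval for extremely fast callables.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return iters / elapsed


def speedup(fps_baseline: float, fps_pruned: float) -> float:
    """Ratio of pruned FPS to baseline FPS (values above 1.0 mean faster)."""
    return fps_pruned / fps_baseline
```

With the numbers reported above, `speedup(7.4, 8.5)` comes out at roughly 1.15, i.e. about a 15% gain at a pruning rate of 0.4.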

Environment

TensorRT Version: 8.2.1.8

NVIDIA GPU:

NVIDIA Driver Version:

CUDA Version: 10.2

CUDNN Version: 8.2.1.32

Operating System: Ubuntu 18.04

Python Version (if applicable): 3.6

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link:

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

May 21 '24 13:05 minhhotboy9x
