YOLOv8-ONNX-TensorRT

👀 Apply YOLOv8 exported with ONNX or TensorRT (FP16, INT8) to a real-time camera
🏆 Performance
> [!NOTE]
> - Tested on `Nvidia Jetson Orin Nano`
⭐ ONNX (CPU)
YOLOv8n
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8n.pt | - | 2 | 535.8 | 37.1 |
| yolov8n.onnx | FP16 | 7 | 146 | 37 |
YOLOv8s
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8s.pt | - | 1 | 943.9 | 44.7 |
| yolov8s.onnx | FP16 | 3 | 347.6 | 44.7 |
YOLOv8m
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8m.pt | - | 0.5 | 1745.2 | 50.1 |
| yolov8m.onnx | FP16 | 1.2 | 1126.3 | 50.1 |
YOLOv8l and YOLOv8x were too slow on the CPU to measure
⭐ TensorRT (GPU)
YOLOv8n
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8n.pt | - | 36 | 21.9 | 37.1 |
| yolov8n.engine | FP16 | 60 | 7.3 | 37.1 |
| yolov8n.engine | INT8 | 63 | 5.8 | 33 |
YOLOv8s
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8s.pt | - | 27 | 33.1 | 44.7 |
| yolov8s.engine | FP16 | 48 | 11.4 | 44.7 |
| yolov8s.engine | INT8 | 57 | 8.2 | 41.2 |
YOLOv8m
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8m.pt | - | 14 | 66.5 | 50.1 |
| yolov8m.engine | FP16 | 30 | 23.6 | 50 |
| yolov8m.engine | INT8 | 38 | 17.1 | 46.2 |
YOLOv8l
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8l.pt | - | 9 | 103.2 | 52.9 |
| yolov8l.engine | FP16 | 22 | 35.5 | 52.8 |
| yolov8l.engine | INT8 | 31 | 22.4 | 50.1 |
YOLOv8x
| Model | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
|---|---|---|---|---|
| yolov8x.pt | - | 6 | 160.2 | 54 |
| yolov8x.engine | FP16 | 15 | 56.6 | 53.9 |
| yolov8x.engine | INT8 | 24 | 33.9 | 51.1 |
> [!NOTE]
> - FPS is measured while an object is being detected, so it reflects the full camera pipeline rather than raw inference: for example, 7.3 ms per inference would allow ~137 FPS in theory, but the yolov8n FP16 engine delivers 60 FPS end to end
> - Speed averages and mAP<sup>val</sup> values are for single-model, single-scale inference on the COCO val2017 dataset
> [!TIP]
> - You can download the exported ONNX and TensorRT files from the Releases page
> [!CAUTION]
> - Optimizing and exporting models on your own device will give you the best results
✏️ Prepare
- Install `CUDA`
- Install `PyTorch`
- Install `TensorRT` (only needed if you use TensorRT)
- Git clone the repository and install the Python requirements

    ```
    git clone https://github.com/the0807/YOLOv8-ONNX-TensorRT
    cd YOLOv8-ONNX-TensorRT
    pip install -r requirements.txt
    ```

- Install or upgrade the `ultralytics` package

    ```
    # Install
    pip install ultralytics

    # Upgrade
    pip install -U ultralytics
    ```

- Prepare your own dataset with PyTorch weights such as `yolov8n.pt`
- (Optional) If you want to test with a YOLOv8 base model rather than a custom model, run the code below to prepare the `COCO` dataset

    ```
    cd datasets

    # It will take time to download
    python3 coco_download.py
    ```

> [!IMPORTANT]
> Install a `PyTorch` build that is compatible with your `CUDA` version
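A quick sanity check that the installed PyTorch build actually sees your CUDA device (a minimal sketch, not a script from this repository):

```python
# Verify that PyTorch was built with CUDA support and can see the GPU
import torch

print(torch.__version__)          # build string, e.g. ends in +cuXXX for CUDA builds
print(torch.cuda.is_available())  # True if the CUDA runtime is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Orin" on a Jetson Orin Nano
```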
⚡️ Optional (recommended for high speed)
⭐ Jetson
- Enable MAX Power Mode and Jetson Clocks

    ```
    # MAX Power Mode
    sudo nvpmodel -m 0

    # Enable Clocks (do it again after every reboot)
    sudo jetson_clocks
    ```

- Install the Jetson Stats application

    ```
    sudo apt update
    sudo pip install jetson-stats
    sudo reboot

    # Launch the monitor
    jtop
    ```
📚 Usage
⭐ ONNX
1. Turn the PyTorch model into ONNX
```
python3 export_onnx.py --model 'model/yolov8n.pt' --q fp16 --data='datasets/coco.yaml'
```
Description of all arguments:
- `--model`: **(required)** The PyTorch model you trained, such as `yolov8n.pt`
- `--q`: Quantization method `[fp16]`
- `--data`: Path to your `data.yaml`
- `--batch`: Specifies the export batch inference size, or the max number of images the exported model will process concurrently in predict mode
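For reference, `export_onnx.py` presumably wraps the Ultralytics export API; a minimal sketch of the equivalent call (the mapping of `--q fp16` to `half=True` is an assumption about how the script translates its flags):

```python
# Minimal sketch of an FP16 ONNX export via the Ultralytics API
from ultralytics import YOLO

model = YOLO("model/yolov8n.pt")  # trained PyTorch weights
model.export(
    format="onnx",  # writes model/yolov8n.onnx next to the weights
    half=True,      # FP16 quantization, i.e. --q fp16
    batch=1,        # matches the --batch argument
)
```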
2. Real-time camera inference
```
python3 run_camera.py --model 'model/yolov8n.onnx' --q fp16
```
Description of all arguments:
- `--model`: The exported ONNX model, such as `yolov8n.onnx`
- `--q`: Quantization method `[fp16]`
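`run_camera.py` is the repository's own loop; a stand-in sketch of what real-time camera inference with an exported model looks like (the camera index and window handling here are assumptions):

```python
# Minimal sketch of real-time camera inference with an exported model
import cv2
from ultralytics import YOLO

model = YOLO("model/yolov8n.onnx")  # a .engine file works the same way
cap = cv2.VideoCapture(0)           # default camera; adjust the index if needed

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)         # run detection on the frame
    annotated = results[0].plot()  # draw boxes and labels
    cv2.imshow("YOLOv8", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # quit with 'q'
        break

cap.release()
cv2.destroyAllWindows()
```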
⭐ TensorRT
1. Turn the PyTorch model into TensorRT engine
```
python3 export_tensorrt.py --model 'model/yolov8n.pt' --q int8 --data='datasets/coco.yaml' --workspace 4 --batch 1
```
Description of all arguments:
- `--model`: **(required)** The PyTorch model you trained, such as `yolov8n.pt`
- `--q`: Quantization method `[fp16, int8]`
- `--data`: Path to your `data.yaml`
- `--batch`: Specifies the export batch inference size, or the max number of images the exported model will process concurrently in predict mode
- `--workspace`: Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance
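As with ONNX, the script presumably delegates to the Ultralytics exporter; a hedged sketch of the equivalent INT8 engine export (the flag-to-argument mapping is an assumption):

```python
# Minimal sketch of an INT8 TensorRT export via the Ultralytics API
from ultralytics import YOLO

model = YOLO("model/yolov8n.pt")
model.export(
    format="engine",            # writes model/yolov8n.engine
    int8=True,                  # INT8 quantization, i.e. --q int8
    data="datasets/coco.yaml",  # validation images drive the calibration
    workspace=4,                # max TensorRT workspace in GiB
    batch=1,
)
```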
2. Real-time camera inference
```
python3 run_camera.py --model 'model/yolov8n.engine' --q int8
```
Description of all arguments:
- `--model`: The model to run, such as `yolov8n.pt` or `yolov8n.engine`
- `--q`: Quantization method `[fp16, int8]`
> [!IMPORTANT]
> - When exporting to `TensorRT` (INT8), a calibration pass is run on the validation data of your dataset. To minimize the loss of mAP, use at least 300 validation images; more than 1,000 are recommended
> [!TIP]
> - You can get more information in the Ultralytics export documentation
> [!WARNING]
> - If the export is aborted or killed, reduce the `--batch` and `--workspace` values
🧐 Validation
⭐ ONNX
```
python3 validation.py --model 'model/yolov8n.onnx' --q fp16 --data 'datasets/coco.yaml'
```
Description of all arguments:
- `--model`: **(required)** The exported ONNX model, such as `yolov8n.onnx`
- `--q`: Quantization method `[fp16]`
- `--data`: Path to your `data.yaml`
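For reference, the same validation can be run directly through the Ultralytics API; a minimal sketch (assuming `validation.py` wraps something like this):

```python
# Minimal sketch of validating an exported model on the COCO val split
from ultralytics import YOLO

model = YOLO("model/yolov8n.onnx")              # or model/yolov8n.engine
metrics = model.val(data="datasets/coco.yaml")  # evaluates on the val split
print(metrics.box.map)                          # mAP 50-95
```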
⭐ TensorRT
```
python3 validation.py --model 'model/yolov8n.engine' --q int8 --data 'datasets/coco.yaml'
```
Description of all arguments:
- `--model`: **(required)** The exported TensorRT engine, such as `yolov8n.engine`
- `--q`: Quantization method `[fp16, int8]`
- `--data`: Path to your `data.yaml`