tensorrtx
Yolov5s' INT8 model gave very poor result.
Env
- GPU: Nvidia AGX Xavier
- OS: Ubuntu 18.04
- CUDA: 10.2
- TensorRT: 7.1.3
About this repo
- tag: v5.0
- model: yolov5
Your problem
I followed both the FP16/FP32 steps and the INT8 quantization steps for the Yolov5s model and successfully built both engines. The FP16 engine gives very good results (almost no mAP drop compared to FP32), but the INT8 engine's accuracy is very poor and its inference speed is almost the same as FP16. On the two sample images I do get boxes on the true targets, but with very low confidence scores, around 0.08. Do you have any recommendations? Could there be an extra step that was not mentioned in the yolov5 section?
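For reference, the INT8 path builds the engine through a TensorRT post-training calibrator. The sketch below shows the shape of that interface on TensorRT 7; the class name, `loadNextBatchToDevice`, and the `int8calib.table` filename are placeholders, not the repo's actual identifiers. One common pitfall to check: if a calibration cache file already exists, TensorRT reuses it silently instead of recalibrating.

```cpp
#include "NvInfer.h"
#include <fstream>
#include <iterator>
#include <vector>

// Hypothetical helper: copies the next preprocessed batch into device
// memory, returns false once the calibration set is exhausted.
bool loadNextBatchToDevice(void* deviceInput);

class Int8Calibrator : public nvinfer1::IInt8EntropyCalibrator2 {
public:
    explicit Int8Calibrator(void* deviceInput) : deviceInput_(deviceInput) {}

    int getBatchSize() const override { return 1; }

    bool getBatch(void* bindings[], const char* names[], int nbBindings) override {
        if (!loadNextBatchToDevice(deviceInput_)) return false;  // calibration done
        bindings[0] = deviceInput_;  // single input binding assumed
        return true;
    }

    // TensorRT reuses this cache if it exists: delete the file after changing
    // the calibration images, otherwise no recalibration actually happens.
    const void* readCalibrationCache(size_t& length) override {
        cache_.clear();
        std::ifstream in("int8calib.table", std::ios::binary);
        if (in.good()) {
            cache_.assign(std::istreambuf_iterator<char>(in),
                          std::istreambuf_iterator<char>());
        }
        length = cache_.size();
        return cache_.empty() ? nullptr : cache_.data();
    }

    void writeCalibrationCache(const void* cache, size_t length) override {
        std::ofstream out("int8calib.table", std::ios::binary);
        out.write(static_cast<const char*>(cache), length);
    }

private:
    void* deviceInput_{nullptr};   // preallocated device buffer for one batch
    std::vector<char> cache_;      // in-memory copy of the calibration table
};
```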
INT8 having lower accuracy is normal. You can try a different calibration set, more calibration images, or an INT8 QAT (quantization-aware training) method.
The inference speed depends on your device; the Jetson may not have good INT8 performance. You could try a GTX/RTX card, etc.
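On the speed point, you can check up front whether the device reports fast INT8 kernels and allow per-layer FP16 fallback, roughly as below (TensorRT 7 builder API; the `enableInt8` wrapper is just for illustration, and builder/config/calibrator are assumed to be created elsewhere):

```cpp
#include "NvInfer.h"

// Illustrative wrapper around the TensorRT 7 builder settings for INT8.
void enableInt8(nvinfer1::IBuilder* builder,
                nvinfer1::IBuilderConfig* config,
                nvinfer1::IInt8Calibrator* calibrator)
{
    // If the GPU lacks fast INT8 kernels, TensorRT falls back to higher
    // precision, which is one reason INT8 can run no faster than FP16.
    if (!builder->platformHasFastInt8()) {
        // Consider staying with FP16 on this device.
    }
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    config->setInt8Calibrator(calibrator);
    // Let layers without good INT8 kernels fall back to FP16.
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
}
```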
Please try our quantization tool ppq, which currently supports TensorRT, ncnn, OpenVINO, SNPE, etc.: https://github.com/openppl-public/ppq/blob/master/md_doc/deploy_trt_by_api.md