TensorRT
Why is YOLOv8 INT8 quantization via pytorch_quantization slower than a plain TensorRT `--fp16` build?

Device: NVIDIA Jetson NX
1. Using `trtexec --fp16`:

   `/usr/src/tensorrt/bin/trtexec --onnx=best.onnx --workspace=4096 --saveEngine=best.engine --fp16`

   Measured inference latency: 36.8 ms
2. Using pytorch_quantization INT8:

   `/usr/src/tensorrt/bin/trtexec --onnx=best.onnx --saveEngine=v8s_ptq.engine --int8 --workspace=4096`

   Measured inference latency: 39.5 ms
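One likely factor: an ONNX exported from pytorch_quantization carries explicit QuantizeLinear/DequantizeLinear (Q/DQ) node pairs around each quantized layer. TensorRT must honor these placements, which constrains its layer fusion; any Q/DQ pair it cannot fuse becomes an extra elementwise pass over the tensor, and layers left outside Q/DQ scopes still run in FP16/FP32, so the INT8 math savings can be outweighed by the added ops. Below is a minimal sketch (assumptions: per-tensor, symmetric, 8-bit quantization with a `fake_quant_int8` helper I made up for illustration) of the round-trip each Q/DQ pair performs:

```python
import numpy as np

def fake_quant_int8(x, amax):
    """Simulate one Q/DQ pair: quantize to int8, then dequantize back.

    This mirrors the per-tensor symmetric scheme pytorch_quantization
    commonly uses; the helper name and exact details are illustrative,
    not the library's API.
    """
    scale = amax / 127.0                      # Q: map [-amax, amax] -> [-127, 127]
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale       # DQ: back to float for the next layer

x = np.array([0.5, -1.2, 3.3, 127.0], dtype=np.float32)
amax = float(np.abs(x).max())                 # calibration amax (max-abs here)
xq = fake_quant_int8(x, amax)
# Each such Q/DQ round-trip is an extra elementwise op the engine must
# either fuse into a neighboring kernel or execute as its own kernel;
# unfused pairs add latency that FP16 inference never pays.
```

The takeaway for debugging: compare per-layer timings of both engines (e.g. with `trtexec --dumpProfile`) to see which layers actually run in INT8 and where unfused Q/DQ or precision fallbacks appear.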