yolor INT 8 quantization support


Open shekarneo opened this issue 3 years ago • 1 comment

Feb 22 '22 06:02 shekarneo

I managed to quantize the model using NVIDIA/pytorch-quantization. In my experiments, accuracy dropped by around 3%, GPU memory usage fell by only 10%, and the speed of the quantized engine (PTQ -> TensorRT-FP16-INT8) was close to that of the plain TensorRT-FP16 engine (no PTQ). Personally, I don't think it helps much.

Mar 04 '22 10:03 haritsahm
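
For readers unfamiliar with what PTQ does to the weights, here is a rough sketch of symmetric per-tensor INT8 quantization with max calibration, the general kind of scheme post-training quantization tools apply. It is plain Python and does not use the NVIDIA/pytorch-quantization API. The function names (`calibrate_amax`, `fake_quantize`) and the sample values are made up for illustration and are not taken from the experiment described above.

```python
# Illustrative sketch only: symmetric per-tensor INT8 "fake" quantization.
# This is NOT the pytorch-quantization API; all names here are hypothetical.

def calibrate_amax(values):
    """Max calibration: use the largest absolute value as the clipping range."""
    return max(abs(v) for v in values)

def fake_quantize(values, amax, num_bits=8):
    """Quantize to signed integers in [-qmax, qmax], then dequantize to floats."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    if amax == 0:
        return list(values)
    scale = amax / qmax  # size of one integer step in float units
    result = []
    for v in values:
        q = round(v / scale)          # nearest integer level
        q = max(-qmax, min(qmax, q))  # clip to the representable range
        result.append(q * scale)      # map back to float
    return result

# Made-up weights, just to show the rounding error introduced.
weights = [0.02, -0.5, 0.731, 1.27, -1.0]
amax = calibrate_amax(weights)
dequantized = fake_quantize(weights, amax)
max_error = max(abs(a - b) for a, b in zip(weights, dequantized))
print("amax:", amax)
print("max rounding error:", max_error)
```

Each value in range moves by at most half a quantization step (`amax / 127 / 2`), which is where a small accuracy drop can come from. Whether the INT8 engine actually runs faster than FP16 depends on which kernels TensorRT ends up selecting for each layer.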
