ZeroQ increased inference latency for quantized model

increased inference latency for quantized model

Open ZongqiangZhang opened this issue 4 years ago • 0 comments

I hava just reproduced the classification on resnet50+imagenet. The accuracy is excellent!

But there is a significant increase in inference latency for quantized model. Test results on resnet + imagenet + tesla t4:

Does anybody hit the same issue ?

Jul 30 '20 13:07 ZongqiangZhang