ZeroQ

increased inference latency for quantized model

Open ZongqiangZhang opened this issue 4 years ago • 0 comments

I have just reproduced the classification results on resnet50 + imagenet. The accuracy is excellent!

But inference latency increases significantly for the quantized model. Test results for resnet50 + imagenet on a Tesla T4:

  • test(model, test_loader) takes 143 seconds
  • test(quantized_model, test_loader) takes 1442 seconds
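For anyone trying to reproduce this, here is a minimal sketch of how the two calls could be timed and compared. It is not necessarily how the numbers above were measured. `time_call` and `slowdown` are hypothetical helper names, and the `sync` hook is an assumption: on a GPU one would typically pass `torch.cuda.synchronize` so that queued kernels are counted before the clock stops.

```python
import time


def time_call(fn, *args, sync=None, **kwargs):
    """Run fn once and return (result, elapsed_seconds).

    sync is an optional zero-argument callable invoked before the clock
    starts and again before it stops. On a CUDA device this would be
    torch.cuda.synchronize (an assumption about the setup, not taken
    from the issue), so asynchronous kernels are included in the timing.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return result, elapsed


def slowdown(baseline_seconds, quantized_seconds):
    """Ratio of quantized run time to baseline run time."""
    return quantized_seconds / baseline_seconds


# Figures reported in this issue: 143 s (model) vs 1442 s (quantized_model).
ratio = slowdown(143, 1442)
print(f"quantized model is {ratio:.1f}x slower")  # prints "quantized model is 10.1x slower"
```

With the real objects this would be called as `time_call(test, model, test_loader, sync=torch.cuda.synchronize)` and the same for `quantized_model`.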

Has anybody else hit the same issue?

ZongqiangZhang, Jul 30 '20 13:07