Query: How to run BERT INT8 TF model

Open avinashcpandey opened this issue 3 years ago • 1 comment

Hi,

I am trying to run BERT INT8 with the TF backend. However, I don't see any TF INT8 model info at the link below.
https://github.com/mlcommons/inference/tree/master/language/bert

Any help on how to run it would be highly appreciated.

avinashcpandey, Oct 13 '22 11:10

Not all quantized models are included in the inference repo. You're free to take any fp32 model and apply your own quantization method. More details regarding retraining can be found here

arjunsuresh, Oct 15 '22 18:10
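
For readers unfamiliar with what "your own quantization method" might involve, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is an illustration of the general idea only, not the method used by MLPerf submitters or by TensorFlow; the function names and the sample weights are hypothetical, and a real BERT workflow would rely on a framework's quantization tooling plus calibration or retraining.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to integers in [-127, 127].

    Hypothetical helper for illustration; not part of any library.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero or empty tensor
    # so that we never divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map INT8 values back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Hypothetical fp32 weights standing in for one tensor of a model.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

The round trip loses at most about half a quantization step per value, which is why accuracy-sensitive models such as BERT often need calibration data or retraining after quantization.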