TensorFlow 1.15 and TensorFlow 2.4 were used to quantize the same pb model to int8 precision, and the quantization results differ: the float32 biases are identical, but the int8 convolution-kernel weights are not.
1. System information
TensorFlow 1.15, TensorFlow 2.4
2. Code
converter = tf.lite.TFLiteConverter.from_saved_model(path)
converter.post_training_quantize = True
converter.target_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()
Hi, could you provide an example?
TF 1.15 may have had some issues with bias overflow that have since been fixed by adjusting the weight scales.
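To make the overflow point concrete: TFLite stores a conv bias as int32 with scale = input_scale * weight_scale, so a very small weight scale can push the quantized bias past the int32 range; nudging the weight scale up fixes the bias but changes the stored int8 weights. Below is a rough pure-Python sketch of that mechanism with made-up numbers; it is not the actual TFLite implementation.

```python
import numpy as np

INT32_MAX = 2**31 - 1

def quantize_bias(bias, input_scale, weight_scale):
    # TFLite stores a conv bias as int32 with scale = input_scale * weight_scale.
    bias_scale = input_scale * weight_scale
    return np.round(bias / bias_scale)

bias = 12.0          # hypothetical float32 bias
input_scale = 0.05   # hypothetical input activation scale

# A pathologically small weight scale blows the int32 bias out of range:
tiny_weight_scale = 1e-10
overflow = abs(quantize_bias(bias, input_scale, tiny_weight_scale)) > INT32_MAX

# Raising the weight scale to a safe lower bound (which, as a side effect,
# changes the stored int8 weight values) keeps the bias representable:
safe_weight_scale = 2.0 * abs(bias) / (input_scale * INT32_MAX)
ok = abs(quantize_bias(bias, input_scale,
                       max(tiny_weight_scale, safe_weight_scale))) <= INT32_MAX
```

This is why a converter fix that only touches weight scales can still leave the float32 bias values unchanged while the int8 weights differ between versions.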
Two tflite files were generated from one pb file: one with TF 1.15, the other with TF 2.4. Two things are different:
1. The first layer is conv2d in the pb model, but it is depthwiseconv2d in the 1.15 tflite and conv2d in the 2.4 tflite.
2. The quantized weights in the 1.15 tflite differ from the quantized weights in the 2.4 tflite, but the LSTM weights are the same.


The screenshots are from Netron. Thank you very much.
Hi @teijeong, could you take a look?
I'm not sure about the different first layer in 1.15 vs 2.4. Regarding the weight difference: is the left result from TF 2.4? Recent TFLite supports per-channel quantization for conv-like ops, so the filter gets a separate scale for each output channel.
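To illustrate why per-channel quantization changes the stored int8 values even for identical float weights: symmetric int8 quantization picks scale = max|w| / 127, computed either once for the whole filter (older per-tensor behavior) or once per output channel. A small numpy sketch with toy shapes and values, not TFLite's actual code:

```python
import numpy as np

def quantize_symmetric_int8(w, axis=None):
    # axis=None -> one scale for the whole tensor; axis=0 -> one scale per
    # output channel (as TFLite does for conv-like op filters).
    if axis is None:
        scale = np.max(np.abs(w)) / 127.0
    else:
        reduce_axes = tuple(i for i in range(w.ndim) if i != axis)
        scale = np.max(np.abs(w), axis=reduce_axes, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
# Toy conv filter laid out [out_channels, kh, kw, in_channels];
# channel 0 has much larger magnitudes than channel 1.
w = rng.normal(size=(2, 3, 3, 4)).astype(np.float32)
w[0] *= 10.0

q_tensor, s_tensor = quantize_symmetric_int8(w)        # per-tensor
q_channel, s_channel = quantize_symmetric_int8(w, 0)   # per-channel

# Per-channel gives the small-magnitude channel its own, finer scale, so the
# stored int8 values differ from the per-tensor result:
different = not np.array_equal(q_tensor, q_channel)
```

If TF 1.15 used a per-tensor scheme and TF 2.4 uses per-channel for conv filters, the int8 weight bytes would differ exactly as observed, while ops without per-channel support (such as the LSTM weights here) would still match.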