
Weights in fully connected layers don't follow the TensorFlow quantization spec (zero-point != 0)

Open bhbruce opened this issue 3 years ago • 3 comments

1. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.5 LTS
  • TensorFlow installation (pip package or built from source): pip
  • TensorFlow library (version, if pip package or github SHA, if built from source): tensorflow 2.5.0

2. Code

Provide code to help us reproduce your issues using one of the following options:

  1. Demonstrate how to build your TF model: I downloaded the quantization-aware training INT8 model from the google-research/mobilebert repo. The model download link is download link.
  2. Please follow this colab page to convert the model.
  • QAT INT8 MobileBERT TensorFlow model: download link. Untar the file; the model is in "mobilebert_squad_savedmodels/quant_saved_model".

  • Converted INT8 tflite model: download link

3. Failure after conversion

  • Model produces wrong results: the FC layers have zero-point != 0, which violates the quantization spec (see attached screenshot).

  • Conversion to TFLite also fails on newer versions: only TF 2.5.0 can successfully convert the model to an INT8 TFLite model; TF 2.6 cannot.
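
The spec requires int8 weight tensors to be symmetrically quantized (zero_point == 0); only activations may carry a non-zero zero-point. A minimal sketch of the check behind the screenshot, written against the list of dicts returned by `tf.lite.Interpreter.get_tensor_details()` (the name-based heuristic for spotting weight tensors is an assumption, not part of the spec):

```python
def find_asymmetric_weights(tensor_details):
    """Return (name, zero_points) for int8 weight-like tensors that
    violate the TFLite quantization spec (weights must have
    zero_point == 0; activations are exempt).

    `tensor_details` is the list of dicts returned by
    tf.lite.Interpreter.get_tensor_details().
    """
    violations = []
    for t in tensor_details:
        name = t.get("name", "")
        # Heuristic (assumption): in this converted model, weight
        # tensors carry "weights" or "kernel" in their name.
        if not any(k in name for k in ("weights", "kernel")):
            continue
        zero_points = t.get("quantization_parameters", {}).get("zero_points", [])
        if any(zp != 0 for zp in zero_points):
            violations.append((name, list(zero_points)))
    return violations


# Usage with a converted model (model path is an assumption):
# import tensorflow as tf
# interpreter = tf.lite.Interpreter(model_path="mobilebert_int8.tflite")
# print(find_asymmetric_weights(interpreter.get_tensor_details()))
```

For a spec-compliant model this returns an empty list; the issue above is that the converted MobileBERT's FC weight tensors show up here.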

bhbruce avatar Sep 06 '21 10:09 bhbruce

@thaink @teijeong @daverim for the visibility.

abattery avatar Sep 06 '21 10:09 abattery

The weight input of this FC op is not actually a weight; that's why we don't use symmetric quantization for it. I think this is a corner case where TF's EinsumDense is converted into TFLite FC ops, because TFLite doesn't have a matmul op, but the result seems to violate the quantization spec. @teijeong Do you have any idea why this stopped working in TF 2.6+?

Xhark avatar Sep 08 '21 19:09 Xhark
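
For context on the symmetric-vs-asymmetric distinction mentioned above: both schemes use the affine mapping real = scale * (q - zero_point), but the spec requires weights to use zero_point == 0 while activations may use any zero-point in range. A minimal sketch (the scale and zero-point values are illustrative only):

```python
import numpy as np


def quantize(real, scale, zero_point, qmin=-128, qmax=127):
    """Affine int8 quantization: q = round(real / scale) + zero_point."""
    q = np.round(real / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)


def dequantize(q, scale, zero_point):
    """Inverse affine mapping: real = scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)


# Symmetric weight quantization (spec-compliant: zero_point == 0).
w = np.array([-0.5, 0.0, 0.25], dtype=np.float32)
qw = quantize(w, scale=0.004, zero_point=0)

# Asymmetric activation quantization (non-zero zero_point is allowed
# here, e.g. for a post-ReLU range of [0, 1.02]).
a = np.array([0.0, 0.5, 1.0], dtype=np.float32)
qa = quantize(a, scale=0.004, zero_point=-128)
```

The issue reported above is that the converted model's FC *weight* tensors use the second scheme, which the spec reserves for activations.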

@Xhark Thanks for your response! @teijeong Is there any update on this issue?

bhbruce avatar Oct 01 '21 09:10 bhbruce