Cannot perform quantization-aware training with the object detection API
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
- [x] I checked to make sure that this issue has not already been filed.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/research/object_detection/model_main_tf2.py
2. Describe the bug
My pipeline.config includes the following lines:
```
graph_rewriter {
  quantization {
    delay: 0
    weight_bits: 8
    activation_bits: 8
  }
}
```
However, quantization does not seem to be performed. I exported the model with object_detection/export_tflite_graph_tf2.py and then converted it to TFLite (see code below), but when inspecting the network with Netron I noticed that the layers' inputs are still float32.
3. Steps to reproduce
1. Run object_detection/model_main_tf2.py (standard pipeline.config for MobileNetV2 SSD + the lines specified above)
2. Run object_detection/export_tflite_graph_tf2.py
3. Run the following code for TFLite conversion (note: the method is `from_saved_model`, lowercase):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("my_path")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
q_model = converter.convert()
with open("q_model.tflite", "wb") as f:
    f.write(q_model)
```
Adding the line `converter.target_spec.supported_types = [tf.uint8]` doesn't change anything.
Adding the line `converter.inference_input_type = tf.uint8` raises an error (the type must be float32).
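For comparison, here is a minimal sketch of a full-integer post-training quantization that does produce uint8 inputs. A hypothetical toy Keras model stands in for my exported SavedModel, and the representative dataset uses random data only for illustration:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the exported detection SavedModel.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])

def representative_dataset():
    # Yields calibration samples so the converter can pick activation ranges.
    for _ in range(10):
        yield [np.random.rand(1, 8).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

# Inspect the converted model; with the settings above the input should be uint8.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
print(interpreter.get_input_details()[0]["dtype"])
```

With the exported detection model, the same settings raise the float32 error described above.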
4. Expected behavior
I expect the converted model to be fully quantized.
5. Additional context
This may help in understanding the problem: the first checkpoint is 19 MB while the second is 39 MB (note that I set `delay: 0` for quantization).
6. System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- Mobile device name if the issue happens on a mobile device:
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 2.4.1
- Python version: 3.6.9
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 11.2
- GPU model and memory: Titan xp, 12GB