QAT model cannot perform inference
After training an efficientdet-lite0 or efficientdet-d0 model with quantization-aware training (using `keras.train --hparams="model_optimizations.quantize={}" ...`), both the saved model and the exported tflite model achieve 0 mAP when evaluated with `keras.eval` or `keras.eval_tflite`, even though the model had non-zero mAP during training/eval.
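In case it helps reproduce this independently of `keras.eval_tflite`, here is a rough sketch of how the exported tflite model can be driven directly with `tf.lite.Interpreter` (the model path is a placeholder for my actual export):

```python
import numpy as np
import tensorflow as tf

# Placeholder path to the QAT tflite export.
interpreter = tf.lite.Interpreter(model_path="efficientdet-lite0-qat.tflite")
interpreter.allocate_tensors()

# Feed a dummy image with whatever shape/dtype the converter produced.
input_details = interpreter.get_input_details()[0]
dummy = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], dummy)
interpreter.invoke()

# Print the first few values of each output head.
for out in interpreter.get_output_details():
    print(out["name"], interpreter.get_tensor(out["index"]).ravel()[:5])
```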
If I export an FP16 or FP32 tflite model using `keras.inspector --mode=export --hparams="model_optimizations.quantize={}" ...`, the models achieve good mAP, but there are extra quantize and dequantize nodes in the graph.
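The extra nodes show up when dumping the operator list of the exported model; a minimal sketch, assuming TF >= 2.7 for `tf.lite.experimental.Analyzer` (model path is again a placeholder):

```python
import tensorflow as tf

# Prints every op in the tflite graph, making the stray
# QUANTIZE -> DEQUANTIZE pairs around each conv visible.
tf.lite.experimental.Analyzer.analyze(
    model_path="efficientdet-lite0-fp32.tflite")
```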
If I export an INT8 tflite model using `keras.inspector --mode=export --hparams="model_optimizations.quantize={}" ...`, the model exports. But when I run `keras.eval_tflite`, it errors with:

```
RuntimeError: tensorflow/lite/kernels/dequantize.cc:61 op_context.input->type == kTfLiteUInt8 || op_context.input->type == kTfLiteInt8 || op_context.input->type == kTfLiteInt16 || op_context.input->type == kTfLiteFloat16 was not true.
Node number 237 (DEQUANTIZE) failed to prepare.
```
This appears to be caused by the extra quantize and dequantize nodes in the graph. After a conv layer, the nodes end up looking like `conv -> quantize -> dequantize -> dequantize -> quantize -> mul`, with the quantize and dequantize nodes being unnecessary because the conv and mul nodes are int8 anyway.
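To see which dtype actually feeds the failing DEQUANTIZE, here is a sketch that walks the op list; note that `_get_ops_details()` is a private Interpreter API and the model path is a placeholder:

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="efficientdet-lite0-int8.tflite")
try:
    # Node preparation runs here, so this reproduces the RuntimeError.
    interpreter.allocate_tensors()
except RuntimeError as e:
    print(e)

# Map tensor index -> tensor metadata, then print the input dtypes
# of every DEQUANTIZE op to find the one fed an unsupported type.
tensors = {t["index"]: t for t in interpreter.get_tensor_details()}
for op in interpreter._get_ops_details():
    if op["op_name"] == "DEQUANTIZE":
        dtypes = [tensors[i]["dtype"] if i in tensors else "?" for i in op["inputs"]]
        print("node", op["index"], "DEQUANTIZE input dtypes:", dtypes)
```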