yolo2_light
YOLO v3 INT8 inference in TensorFlow Lite
Hello,
Is it possible to obtain a quantized .tflite version of YOLO v3 / YOLO Tiny v3 to do INT8 inference with the tools in this repository? I've tried using TensorFlow Lite's official tool, toco, but it seems that some layers don't support quantization.
Hi! Yes, I've obtained a quantized .tflite with:
$ bazel run tensorflow/lite/toco:toco -- \
--input_file=mymodel.pb \
--output_file=output.tflite \
--input_shapes=1,416,416,3 \
--input_arrays='input_1' \
--output_format=TFLITE \
--output_arrays='output_0','output_1' \
--inference_type=QUANTIZED_UINT8 \
--std_dev_values=128 --mean_values=128 \
--default_ranges_min=-6 --default_ranges_max=6 \
--change_concat_input_ranges=false \
--allow_custom_ops
BUT I don't understand how to use it.
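For reference, here is a minimal sketch of running the resulting model with the TFLite Python interpreter (it assumes TF 1.13+ where tf.lite.Interpreter is available; the file name and the random stand-in image are placeholders):

import numpy as np
import tensorflow as tf  # assumes TF 1.13+ (tf.lite.Interpreter)

interpreter = tf.lite.Interpreter(model_path='output.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# With --inference_type=QUANTIZED_UINT8 the model expects raw uint8 pixels;
# mean_values=128 and std_dev_values=128 map uint8 [0, 255] to roughly
# float [-1, 1) inside the model.
frame = np.random.randint(0, 256, size=(1, 416, 416, 3), dtype=np.uint8)  # stand-in image
interpreter.set_tensor(input_details[0]['index'], frame)
interpreter.invoke()

for out in output_details:
    raw = interpreter.get_tensor(out['index'])   # uint8 output tensor
    scale, zero_point = out['quantization']      # dequantization parameters
    dequantized = scale * (raw.astype(np.float32) - zero_point)
    print(out['name'], dequantized.shape)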
I've received a
RuntimeWarning: overflow encountered in exp
during my post-processing.
Do you have any idea why?
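That warning typically comes from computing the sigmoid 1 / (1 + exp(-x)) on large-magnitude logits in NumPy post-processing; my assumption is that this is where yours originates. A minimal sketch of a numerically stable variant:

import numpy as np

def stable_sigmoid(x):
    # Numerically stable sigmoid: exp is only ever evaluated on a
    # non-positive argument, so it cannot overflow.
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))  # exp(-x) <= 1 here
    exp_x = np.exp(x[~pos])                   # exp(x) < 1 here
    out[~pos] = exp_x / (1.0 + exp_x)
    return out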
What you have done may be called "dummy quantization": it only tests that the toolchain runs and doesn't actually do anything about quantization, since --default_ranges_min/--default_ranges_max just substitute guessed ranges for real calibration data. For uint8 quantization using toco, you currently may need to consider "quantization-aware training" from TensorFlow (google it for some insights). It inserts quantization layers that measure the min/max of certain tensors and then simulates quantization error during training. After training, freeze the graph with the checkpoint, convert it to tflite, and you've got it!
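For context, a minimal sketch of that setup with the TF 1.x tf.contrib.quantize API (the single conv layer and loss are stand-ins for the full YOLO graph, not the actual model):

import tensorflow as tf  # assumes TensorFlow 1.x with tf.contrib available

# Stand-in model: one conv layer in place of the full YOLO network.
images = tf.placeholder(tf.float32, [None, 416, 416, 3], name='input_1')
labels = tf.placeholder(tf.float32, [None, 416, 416, 8])
net = tf.layers.conv2d(images, 8, 3, padding='same', name='conv')
loss = tf.reduce_mean(tf.square(net - labels))

# Insert fake-quant ops that record per-tensor min/max and simulate
# quantization error during training. Must be called after the model
# and loss are built, before creating the optimizer.
tf.contrib.quantize.create_training_graph(input_graph=tf.get_default_graph(),
                                          quant_delay=0)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

# After training: rebuild the model in a fresh graph, call
#   tf.contrib.quantize.create_eval_graph(input_graph=tf.get_default_graph())
# restore the checkpoint, freeze the graph, then convert with toco using
# --inference_type=QUANTIZED_UINT8. The recorded min/max values then
# replace the --default_ranges_min/--default_ranges_max guesses above.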