
YOLO v3 INT8 inference in TensorFlow Lite


Hello, is it possible to obtain a quantized .tflite version of YOLO v3 / YOLO v3 Tiny for INT8 inference using the tools in this repository? I've tried TensorFlow Lite's official converter, toco, but it seems that some layers don't support quantization.

anferico · Apr 25 '19 15:04

Hi! Yes, I've obtained a quantized .tflite:

$ bazel run tensorflow/lite/toco:toco -- \
    --input_file=mymodel.pb \
    --output_file=output.tflite \
    --input_shapes=1,416,416,3 \
    --input_arrays='input_1' \
    --output_format=TFLITE \
    --output_arrays='output_0','output_1' \
    --inference_type=QUANTIZED_UINT8 \
    --std_dev_values=128 --mean_values=128 \
    --default_ranges_min=-6 --default_ranges_max=6 \
    --change_concat_input_ranges=false \
    --allow_custom_ops
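
For reference, the --mean_values/--std_dev_values flags define how toco maps uint8 input values back to real values: real = (quantized - mean) / std. A minimal sketch of the mapping the flags above imply:

    import numpy as np

    # toco's documented input mapping: real_value = (quantized_uint8 - mean_value) / std_dev_value
    MEAN, STD = 128.0, 128.0  # values from the command above

    def dequantize(q):
        """Map uint8 input values to the real range the model sees."""
        return (np.asarray(q, dtype=np.float32) - MEAN) / STD

    print(dequantize([0, 128, 255]))  # -> [-1.0, 0.0, ~0.992]

So with mean=128 and std=128, the uint8 input range [0, 255] covers roughly [-1, 1] in real values.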

BUT I don't understand how to use it. I get a RuntimeWarning: overflow encountered in exp during my post-processing. Do you have any idea?

ambr89 · May 13 '19 15:05
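That warning typically comes from np.exp being called on a large argument in the sigmoid of the YOLO box-decoding step. A minimal numerically stable sigmoid, as a sketch assuming NumPy-based post-processing:

    import numpy as np

    def stable_sigmoid(x):
        """Sigmoid that never calls exp on a large positive argument."""
        x = np.asarray(x, dtype=np.float64)
        out = np.empty_like(x)
        pos = x >= 0
        out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))   # exp(-x) <= 1 here
        exp_x = np.exp(x[~pos])                     # x < 0, so exp(x) < 1
        out[~pos] = exp_x / (1.0 + exp_x)
        return out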

What you have done may be called "dummy quantization": it only tests the tool and doesn't do any real quantization, since the ranges are just the defaults you supplied. For uint8 quantization using toco, you currently need to consider "quantization-aware training" from TensorFlow (google it for some insights). It inserts quantization layers that measure the min/max of certain tensors and simulate quantization error during training. After training, freeze the graph with the checkpoint, convert it to tflite, and you've got it!
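
Roughly, the TF 1.x flow looks like this (a minimal sketch with a toy conv layer standing in for the real YOLO network; the input name mirrors the command above):

    import tensorflow as tf  # TF 1.x, where tf.contrib.quantize is available

    def toy_model(images):
        # Placeholder for the real network
        net = tf.layers.conv2d(images, 16, 3, padding='same', activation=tf.nn.relu)
        return tf.layers.conv2d(net, 255, 1, padding='same')

    train_graph = tf.Graph()
    with train_graph.as_default():
        images = tf.placeholder(tf.float32, [1, 416, 416, 3], name='input_1')
        logits = toy_model(images)
        # Inserts fake-quant ops that record min/max and simulate uint8 error
        tf.contrib.quantize.create_training_graph(input_graph=train_graph,
                                                  quant_delay=0)
        # ... define a loss and optimizer, train, and save a checkpoint ...

    eval_graph = tf.Graph()
    with eval_graph.as_default():
        images = tf.placeholder(tf.float32, [1, 416, 416, 3], name='input_1')
        logits = toy_model(images)
        # Rewrites the graph for inference with the learned quantization ranges
        tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
        # Restore the trained checkpoint into this graph, freeze it with
        # tf.graph_util.convert_variables_to_constants, and feed the frozen
        # .pb to toco (the --default_ranges flags are then no longer needed).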

kolingv · Aug 15 '19 10:08