
YOLO v3 INT8 inference in TensorFlow Lite


Hello, is it possible to obtain a quantized .tflite version of YOLO v3 / YOLO v3 Tiny for INT8 inference using the tools in this repository? I've tried TensorFlow Lite's official converter, toco, but it seems that some layers don't support quantization.

anferico · Apr 25 '19 15:04

Hi! Yes, I've obtained a quantized .tflite:

$ bazel run tensorflow/lite/toco:toco -- \
    --input_file=mymodel.pb \
    --output_file=output.tflite \
    --input_shapes=1,416,416,3 \
    --input_arrays='input_1' \
    --output_format=TFLITE \
    --output_arrays='output_0','output_1' \
    --inference_type=QUANTIZED_UINT8 \
    --std_dev_values=128 --mean_values=128 \
    --default_ranges_min=-6 --default_ranges_max=6 \
    --change_concat_input_ranges=false \
    --allow_custom_ops
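
For reference, the --mean_values/--std_dev_values flags define how toco maps uint8 input values back to real values: real = (quantized - mean) / std. A minimal sketch of the mapping the flags above imply:

    import numpy as np

    # toco's documented input mapping: real_value = (quantized_uint8 - mean_value) / std_dev_value
    MEAN, STD = 128.0, 128.0  # values from the command above

    def dequantize(q):
        """Map uint8 input values to the real range the model sees."""
        return (np.asarray(q, dtype=np.float32) - MEAN) / STD

    print(dequantize([0, 128, 255]))  # -> [-1.0, 0.0, ~0.992]

So with mean=128 and std=128, the uint8 input range [0, 255] covers roughly [-1, 1] in real values.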

BUT I don't understand how to use it. I get a RuntimeWarning: overflow encountered in exp during my post-processing. Do you have any idea?

ambr89 · May 13 '19 15:05
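That warning typically comes from np.exp being called on a large argument in the sigmoid of the YOLO box-decoding step. A minimal numerically stable sigmoid, as a sketch assuming NumPy-based post-processing:

    import numpy as np

    def stable_sigmoid(x):
        """Sigmoid that never calls exp on a large positive argument."""
        x = np.asarray(x, dtype=np.float64)
        out = np.empty_like(x)
        pos = x >= 0
        out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))   # exp(-x) <= 1 here
        exp_x = np.exp(x[~pos])                     # x < 0, so exp(x) < 1
        out[~pos] = exp_x / (1.0 + exp_x)
        return out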

What you have done may be called "dummy quantization": it only tests the tool and doesn't do any real quantization, since the ranges are just the defaults you supplied. For uint8 quantization using toco, you currently need to consider "quantization-aware training" from TensorFlow (google it for some insights). It inserts quantization layers that measure the min/max of certain tensors and simulate quantization error during training. After training, freeze the graph with the checkpoint, convert it to tflite, and you've got it!
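
Roughly, the TF 1.x flow looks like this (a minimal sketch with a toy conv layer standing in for the real YOLO network; the input name mirrors the command above):

    import tensorflow as tf  # TF 1.x, where tf.contrib.quantize is available

    def toy_model(images):
        # Placeholder for the real network
        net = tf.layers.conv2d(images, 16, 3, padding='same', activation=tf.nn.relu)
        return tf.layers.conv2d(net, 255, 1, padding='same')

    train_graph = tf.Graph()
    with train_graph.as_default():
        images = tf.placeholder(tf.float32, [1, 416, 416, 3], name='input_1')
        logits = toy_model(images)
        # Inserts fake-quant ops that record min/max and simulate uint8 error
        tf.contrib.quantize.create_training_graph(input_graph=train_graph,
                                                  quant_delay=0)
        # ... define a loss and optimizer, train, and save a checkpoint ...

    eval_graph = tf.Graph()
    with eval_graph.as_default():
        images = tf.placeholder(tf.float32, [1, 416, 416, 3], name='input_1')
        logits = toy_model(images)
        # Rewrites the graph for inference with the learned quantization ranges
        tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
        # Restore the trained checkpoint into this graph, freeze it with
        # tf.graph_util.convert_variables_to_constants, and feed the frozen
        # .pb to toco (the --default_ranges flags are then no longer needed).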

kolingv · Aug 15 '19 10:08