tensorflow-yolov4-tflite
Fixed int8 quantization and added experimental mixed int8/int16 quantization
Thanks for providing these examples of working with Yolo v4/v3!
I managed to fix the int8 quantization by adding a model.compile() statement, which resolves the "optimize global tensors" exception. I was also able to remove the overriding of supported_ops by following the examples TensorFlow Lite provides for quantization.
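For reference, a minimal sketch of that conversion setup, assuming a Keras SavedModel and placeholder calibration data (this is not the exact convert_tflite.py code; the paths and the data generator are only illustrative):
import numpy as np
import tensorflow as tf

# Load and compile the Keras model; without model.compile() the converter
# raised the "optimize global tensors" exception mentioned above.
model = tf.keras.models.load_model("./checkpoints/yolov4-416")  # assumed path
model.compile()

def representative_data_gen():
    # Placeholder calibration data; the real script feeds preprocessed images
    # listed in ./data/dataset/val2017.txt.
    for _ in range(100):
        yield [np.random.rand(1, 416, 416, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()

with open("./checkpoints/yolov4-416-int8.tflite", "wb") as f:
    f.write(tflite_model)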
FYI: I'm currently trying to port these models to the K210 / MaixPy MCU, but so far I haven't managed to get nncase to fully consume the tflite files (it doesn't support the SPLIT and DEQUANTIZE tflite op codes).
Note: This only works on the latest tf-nightly (2.4.0+). It doesn't work on tensorflow-2.3.0.
It doesn't fully quantize currently, since the network uses some non-quantizable ops (EXP). I've not looked further into that yet.
Best regards, Mikael
Hello, thanks for the great work. Will the above fixes be able to fully quantize the yolov4/v3 model so that it can run on a TPU?
Hello, thanks for the great work. Will the above fixes be able to fully quantize the yolov4/v3 model so that it can run on a TPU?
No, unfortunately it doesn't fully quantize currently, since the network uses some non-quantizable ops (EXP). I've not looked further into that yet.
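For what it's worth, a hedged sketch of the mixed-mode conversion (the path and the placeholder calibration generator are assumptions): allowing a float fallback lets the converter keep unsupported ops such as EXP in float32 while quantizing the rest to int8.
import numpy as np
import tensorflow as tf

def representative_data_gen():
    # Placeholder calibration data so the sketch is self-contained; use real
    # preprocessed images in practice.
    for _ in range(10):
        yield [np.random.rand(1, 416, 416, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("./checkpoints/yolov4-416")  # assumed path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen

# Strict full-integer mode, which is what raises the EXP error:
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

# Mixed mode: quantize what can be quantized, keep EXP and other unsupported
# ops in float32 so the conversion still succeeds.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
    tf.lite.OpsSet.TFLITE_BUILTINS,
]
tflite_model = converter.convert()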
When I try with
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8, tf.uint8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.int8
converter.representative_dataset = representative_data_gen
I get
RuntimeError: Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
@mikljohansson, running your patch set (pulling your forked repo) gives the following error in my environment
File "convert_tflite.py", line 87, in <module>
app.run(main)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "convert_tflite.py", line 82, in main
save_tflite()
File "convert_tflite.py", line 56, in save_tflite
tflite_model = converter.convert()
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 892, in convert
self).convert(graph_def, input_tensors, output_tensors)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 650, in convert
result = self._calibrate_quantize_model(result, **flags)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 478, in _calibrate_quantize_model
inference_output_type, allow_float, activations_type)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 98, in calibrate_and_quantize
np.dtype(activations_type.as_numpy_dtype()).num)
RuntimeError: Max and min for dynamic tensors should be recorded during calibration: Failed for tensor input_1
Empty min/max for tensor input_1
Since this depends on tf-nightly, perhaps something has changed in the last 10 days since you made this PR? I'm using tf-nightly 2.4.0-dev20200918 and Python 3.7.0.
Note that the below was also in the debug log, a good way before the backtrace information:
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
W0922 23:03:29.533921 4621229504 load.py:133] No training configuration found in save file, so the model was *not* compiled. Compile it manually.
@mikljohansson, running your patch set (pulling your forked repo) gives the following error in my environment
RuntimeError: Max and min for dynamic tensors should be recorded during calibration: Failed for tensor input_1 Empty min/max for tensor input_1
@raryanpur I think the problem might be that some file paths are incorrect in the calibration dataset (e.g. ./data/dataset/val2017.txt). I got this error myself, and it took me a while to figure out that I had gotten the sample image paths wrong :sweat_smile:
I've improved the error reporting for this now; if you pull and try again it should give you a better error message about what's wrong. If it turns out the dataset is missing, there are instructions in the README.md on how to download it.
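A rough illustration of the kind of early check that yields a clearer message (assuming each annotation line starts with an image path; the function name and file name are just examples, not the repo's code):
import os

def load_calibration_paths(annotation_file="./data/dataset/val2017.txt"):
    # Fail early with a readable message instead of an opaque
    # "Empty min/max for tensor input_1" calibration error.
    paths = []
    with open(annotation_file) as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.strip().split()
            if not parts or not os.path.exists(parts[0]):
                raise FileNotFoundError(
                    f"{annotation_file}:{line_no}: image not found: {line.strip()!r}"
                )
            paths.append(parts[0])
    if not paths:
        raise ValueError(f"No calibration images listed in {annotation_file}")
    return paths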
Best, Mikael
Ah that did the trick - thanks @mikljohansson, works now!
@mikljohansson when using this quantized model, how are the inputs and outputs scaled? My understanding is that the inputs are still floats, but the values must be scaled from [0.0, 255.0] to [-128.0, 127.0]. Do the outputs (score and box tensor values) need to be scaled as well?
Hi @mikljohansson, thanks for your great work. After running your modification, I got my yolov3_int_8.tflite model to work. The output is shown below:
[{'name': 'input_1', 'index': 549, 'shape': array([ 1, 416, 416, 3], dtype=int32), 'shape_signature': array([ -1, 416, 416, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 550, 'shape': array([ 1, 10647, 4], dtype=int32), 'shape_signature': array([ 1, -1, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 551, 'shape': array([ 1, 10647, 3], dtype=int32), 'shape_signature': array([ 1, -1, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
My question is: why does it say int32? We actually did an int8 quantization, right? Is there any resource about it? Much appreciated, and thanks for your nice work again.
@mikljohansson when using this quantized model, how are the inputs and outputs scaled? My understanding is that the inputs are still floats, but the values must be scaled from [0.0, 255.0] to [-128.0, 127.0]. Do the outputs (score and box tensor values) need to be scaled as well?
@raryanpur sorry for not getting back to you, the e-mail got lost in my inbox :(
I honestly don't know, sorry. I haven't dug into the input/output scaling and haven't worked on this model for a while (focusing on other things right now). Hopefully you've been able to work it out already :)
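For reference, when inference_input_type / inference_output_type are set to integer types, the scale and zero point can be read from the interpreter instead of being hard-coded; a sketch under that assumption (the model path and the dummy image are only illustrative):
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="./checkpoints/yolov4-416-int8.tflite")  # assumed path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

image = np.random.rand(1, 416, 416, 3).astype(np.float32)  # placeholder preprocessed image

# Quantize the input only if the model actually expects an integer input.
if inp["dtype"] in (np.int8, np.uint8):
    scale, zero_point = inp["quantization"]
    image = (image / scale + zero_point).astype(inp["dtype"])

interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
boxes = interpreter.get_tensor(out["index"])

# Dequantize the output back to float if it came out as int8/uint8.
if out["dtype"] in (np.int8, np.uint8):
    scale, zero_point = out["quantization"]
    boxes = (boxes.astype(np.float32) - zero_point) * scale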
Hi @mikljohansson, thanks for your great work. After running your modification, I got my yolov3_int_8.tflite model to work. The output is shown below:
[{'name': 'input_1', 'index': 549, 'shape': array([ 1, 416, 416, 3], dtype=int32), 'shape_signature': array([ -1, 416, 416, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}] [{'name': 'Identity', 'index': 550, 'shape': array([ 1, 10647, 4], dtype=int32), 'shape_signature': array([ 1, -1, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 551, 'shape': array([ 1, 10647, 3], dtype=int32), 'shape_signature': array([ 1, -1, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
My question is: why does it say int32? We actually did an int8 quantization, right? Is there any resource about it? Much appreciated, and thanks for your nice work again.
Honestly, I'm not sure why that is. I imagine it could be because the network doesn't quantize fully (due to the EXP operator mentioned in earlier comments on this PR). Perhaps you could try uncommenting these lines in convert_tflite.py and see if it makes a difference?
#converter.inference_input_type = tf.uint8
#converter.inference_output_type = tf.int8
This flag might set all intermediate weights and calculations to 8-bit, but I don't think it'd work currently due to the inability to fully quantize the network:
converter.target_spec.supported_types = [tf.int8]
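One way to check how much of the network actually got quantized is to count the tensor dtypes inside the converted model (a sketch; the model path is an assumption):
from collections import Counter

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="./checkpoints/yolov4-416-int8.tflite")
interpreter.allocate_tensors()
# Mostly int8 entries with a few float32 ones would suggest that only the
# EXP-related parts stayed unquantized.
print(Counter(str(t["dtype"]) for t in interpreter.get_tensor_details()))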
After converting a customized (not COCO) yolov3-tiny into .tflite format, I executed the command below:
python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-int8.tflite --quantize_mode int8 --dataset ./coco_dataset/coco/val2017.txt
./checkpoints/yolov4-416 ---> this is not a COCO model, it is from a customized/different dataset
- Please suggest: do I still have to use ./coco_dataset/coco/val2017.txt?
- If not, how can I convert my dataset from YOLO-annotated format into the format of val2017.txt?
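Assuming, as earlier comments suggest, that the calibration step only reads the image path at the start of each annotation line, a list for a custom dataset could be generated with something like this (the directory and file names are just examples):
import glob
import os

# Write a val2017.txt-style calibration list: one image path at the start of
# each line (bounding boxes are not needed for calibration under this assumption).
image_dir = "./data/custom_images"
with open("./data/dataset/custom_val.txt", "w") as f:
    for path in sorted(glob.glob(os.path.join(image_dir, "*.jpg"))):
        f.write(path + "\n")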
RuntimeError: Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'.
I also have the above error. @mikljohansson, were you able to fix this? If yes, can you provide the solution? Thanks!