tensorflow-yolov4-tflite
Fixed int8 quantization and added experimental mixed int8/int16 quantization
Thanks for providing these examples of working with Yolo v4/v3!
I managed to fix the int8 quantization by adding a model.compile() statement, which resolves the "optimize global tensors" exception. I was also able to remove the overriding of supported_ops by following the examples TensorFlow Lite provides for quantization.
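For reference, a minimal sketch of that conversion setup, assuming a Keras SavedModel and placeholder calibration data (this is not the exact convert_tflite.py code; the paths and the data generator are only illustrative):
import numpy as np
import tensorflow as tf

# Load and compile the Keras model; without model.compile() the converter
# raised the "optimize global tensors" exception mentioned above.
model = tf.keras.models.load_model("./checkpoints/yolov4-416")  # assumed path
model.compile()

def representative_data_gen():
    # Placeholder calibration data; the real script feeds preprocessed images
    # listed in ./data/dataset/val2017.txt.
    for _ in range(100):
        yield [np.random.rand(1, 416, 416, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()

with open("./checkpoints/yolov4-416-int8.tflite", "wb") as f:
    f.write(tflite_model)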
FYI: I'm currently trying to port these models to the K210 / MaixPy MCU, but so far I haven't managed to get nncase to fully consume the tflite files (it doesn't support the SPLIT and DEQUANTIZE tflite op codes).
Note: This only works on the latest tf-nightly (2.4.0+). It doesn't work on tensorflow-2.3.0.
It doesn't fully quantize currently, since the network uses some non-quantizable ops (EXP). I've not looked further into that yet.
Best regards, Mikael
Hello, thanks for the great work. Will the above fixes be able to fully quantize the yolov4/v3 model so that it can run on a TPU?
Hello, thanks for the great work. Will the above fixes be able to fully quantize the yolov4/v3 model so that it can run on a TPU?
No, unfortunately it doesn't fully quantize currently, since the network uses some non-quantizable ops (EXP). I've not looked further into that yet.
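For what it's worth, a hedged sketch of the mixed-mode conversion (the path and the placeholder calibration generator are assumptions): allowing a float fallback lets the converter keep unsupported ops such as EXP in float32 while quantizing the rest to int8.
import numpy as np
import tensorflow as tf

def representative_data_gen():
    # Placeholder calibration data so the sketch is self-contained; use real
    # preprocessed images in practice.
    for _ in range(10):
        yield [np.random.rand(1, 416, 416, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("./checkpoints/yolov4-416")  # assumed path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen

# Strict full-integer mode, which is what raises the EXP error:
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

# Mixed mode: quantize what can be quantized, keep EXP and other unsupported
# ops in float32 so the conversion still succeeds.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
    tf.lite.OpsSet.TFLITE_BUILTINS,
]
tflite_model = converter.convert()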
When I try with
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8, tf.uint8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.int8
converter.representative_dataset = representative_data_gen
I get
RuntimeError: Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
@mikljohansson, running your patch set (pulling your forked repo) gives the following error in my environment
File "convert_tflite.py", line 87, in <module>
app.run(main)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "convert_tflite.py", line 82, in main
save_tflite()
File "convert_tflite.py", line 56, in save_tflite
tflite_model = converter.convert()
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 892, in convert
self).convert(graph_def, input_tensors, output_tensors)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 650, in convert
result = self._calibrate_quantize_model(result, **flags)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 478, in _calibrate_quantize_model
inference_output_type, allow_float, activations_type)
File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 98, in calibrate_and_quantize
np.dtype(activations_type.as_numpy_dtype()).num)
RuntimeError: Max and min for dynamic tensors should be recorded during calibration: Failed for tensor input_1
Empty min/max for tensor input_1
Since this depends on tf-nightly, perhaps something has changed in the last 10 days since you made this PR? I'm using tf-nightly 2.4.0-dev20200918 and Python 3.7.0.
Note that the below was also in the debug log, a good way before the backtrace information:
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
W0922 23:03:29.533921 4621229504 load.py:133] No training configuration found in save file, so the model was *not* compiled. Compile it manually.
@mikljohansson, running your patch set (pulling your forked repo) gives the following error in my environment
RuntimeError: Max and min for dynamic tensors should be recorded during calibration: Failed for tensor input_1 Empty min/max for tensor input_1
@raryanpur I think the problem might be that some file paths are incorrect in the calibration dataset (e.g. ./data/dataset/val2017.txt). I got this error myself, and it took me a while to figure out that I had gotten the sample image paths wrong :sweat_smile:
I've improved the error reporting for this now; if you pull and try again it should give you a better error message about what's wrong. If it turns out the dataset is missing, there are instructions in the README.md on how to download it.
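A rough illustration of the kind of early check that yields a clearer message (assuming each annotation line starts with an image path; the function name and file name are just examples, not the repo's code):
import os

def load_calibration_paths(annotation_file="./data/dataset/val2017.txt"):
    # Fail early with a readable message instead of an opaque
    # "Empty min/max for tensor input_1" calibration error.
    paths = []
    with open(annotation_file) as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.strip().split()
            if not parts or not os.path.exists(parts[0]):
                raise FileNotFoundError(
                    f"{annotation_file}:{line_no}: image not found: {line.strip()!r}"
                )
            paths.append(parts[0])
    if not paths:
        raise ValueError(f"No calibration images listed in {annotation_file}")
    return paths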
Best, Mikael
Ah that did the trick - thanks @mikljohansson, works now!
@mikljohansson when using this quantized model, how are the inputs and outputs scaled? My understanding is that the inputs are still floats, but the values must be scaled from [0.0, 255.0] to [-128.0, 127.0]. Do the outputs (score and box tensor values) need to be scaled as well?
Hi @mikljohansson, thanks for your great work. After running your modification, I got my yolov3_int_8.tflite model to work. The output is shown below:
[{'name': 'input_1', 'index': 549, 'shape': array([ 1, 416, 416, 3], dtype=int32), 'shape_signature': array([ -1, 416, 416, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 550, 'shape': array([ 1, 10647, 4], dtype=int32), 'shape_signature': array([ 1, -1, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 551, 'shape': array([ 1, 10647, 3], dtype=int32), 'shape_signature': array([ 1, -1, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
My question is: why does it say int32? We actually did an int8 quantization, right? Is there any resource about it? Much appreciated, and thanks for your nice work again.
@mikljohansson when using this quantized model, how are the inputs and outputs scaled? My understanding is that the inputs are still floats, but the values must be scaled from [0.0, 255.0] to [-128.0, 127.0]. Do the outputs (score and box tensor values) need to be scaled as well?
@raryanpur sorry for not getting back to you, the e-mail got lost in my inbox :(
I honestly don't know, sorry. I haven't dug into the input/output scaling and haven't worked on this model for a while (focusing on other things right now). Hopefully you've been able to work it out already :)
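For reference, when inference_input_type / inference_output_type are set to integer types, the scale and zero point can be read from the interpreter instead of being hard-coded; a sketch under that assumption (the model path and the dummy image are only illustrative):
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="./checkpoints/yolov4-416-int8.tflite")  # assumed path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

image = np.random.rand(1, 416, 416, 3).astype(np.float32)  # placeholder preprocessed image

# Quantize the input only if the model actually expects an integer input.
if inp["dtype"] in (np.int8, np.uint8):
    scale, zero_point = inp["quantization"]
    image = (image / scale + zero_point).astype(inp["dtype"])

interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
boxes = interpreter.get_tensor(out["index"])

# Dequantize the output back to float if it came out as int8/uint8.
if out["dtype"] in (np.int8, np.uint8):
    scale, zero_point = out["quantization"]
    boxes = (boxes.astype(np.float32) - zero_point) * scale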
Hi @mikljohansson, thanks for your great work. After running your modification, I got my yolov3_int_8.tflite model to work. The output is shown below:
[{'name': 'input_1', 'index': 549, 'shape': array([ 1, 416, 416, 3], dtype=int32), 'shape_signature': array([ -1, 416, 416, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}] [{'name': 'Identity', 'index': 550, 'shape': array([ 1, 10647, 4], dtype=int32), 'shape_signature': array([ 1, -1, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 551, 'shape': array([ 1, 10647, 3], dtype=int32), 'shape_signature': array([ 1, -1, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
My question is: why does it say int32? We actually did an int8 quantization, right? Is there any resource about it? Much appreciated, and thanks for your nice work again.
Honestly, I'm not sure why that is. I imagine it could be because the network doesn't quantize fully (due to the EXP operator mentioned in earlier comments on this PR). Perhaps you could try uncommenting these lines in convert_tflite.py and see if it makes a difference?
#converter.inference_input_type = tf.uint8
#converter.inference_output_type = tf.int8
This flag might set all intermediate weights and calculations to 8-bit, but I don't think it'd work currently due to the inability to fully quantize the network:
converter.target_spec.supported_types = [tf.int8]
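One way to check how much of the network actually got quantized is to count the tensor dtypes inside the converted model (a sketch; the model path is an assumption):
from collections import Counter

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="./checkpoints/yolov4-416-int8.tflite")
interpreter.allocate_tensors()
# Mostly int8 entries with a few float32 ones would suggest that only the
# EXP-related parts stayed unquantized.
print(Counter(str(t["dtype"]) for t in interpreter.get_tensor_details()))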
After converting a customized (not COCO) yolov3-tiny into .tflite format, I executed the command below:
python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-int8.tflite --quantize_mode int8 --dataset ./coco_dataset/coco/val2017.txt
./checkpoints/yolov4-416 ---> this is not a COCO model, it is from a customized/different dataset
- Please suggest: do I still have to use ./coco_dataset/coco/val2017.txt?
- If not, how can I convert my dataset from YOLO-annotated format into the format of val2017.txt?
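Assuming, as earlier comments suggest, that the calibration step only reads the image path at the start of each annotation line, a list for a custom dataset could be generated with something like this (the directory and file names are just examples):
import glob
import os

# Write a val2017.txt-style calibration list: one image path at the start of
# each line (bounding boxes are not needed for calibration under this assumption).
image_dir = "./data/custom_images"
with open("./data/dataset/custom_val.txt", "w") as f:
    for path in sorted(glob.glob(os.path.join(image_dir, "*.jpg"))):
        f.write(path + "\n")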
RuntimeError: Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'. Quantization not yet supported for op: 'EXP'.
I also have the above error. @mikljohansson, were you able to fix this? If yes, can you provide the solution? Thanks!