Quantization using ONNX / TFLite
Search before asking
- [X] I have searched the YOLOv5 issues and found no similar bug report.
YOLOv5 Component
No response
Bug
Hello, I previously opened an issue regarding PyTorch quantization, and it appears that a prerequisite for using PyTorch in this context is an in-depth understanding of the model's architecture. Consequently, I switched to other quantization methods.
The initial approach uses ONNX Runtime. Quantization works, but only when I exclude the nodes of the last (Detect) layer:

```python
nodes_to_exclude=[
    '/model.24/Concat_3', '/model.24/Reshape_1', '/model.24/Reshape_3',
    '/model.24/Reshape_5', '/model.24/Concat_2', '/model.24/Concat_1',
    '/model.24/Concat', '/model.24/Mul', '/model.24/Mul_1',
    '/model.24/Mul_2', '/model.24/Mul_3', '/model.24/Mul_4',
    '/model.24/Mul_5', '/model.24/Mul_6', '/model.24/Mul_7',
    '/model.24/Mul_8', '/model.24/Mul_9', '/model.24/Mul_10',
    '/model.24/Mul_11', '/model.24/Add', '/model.24/Add_1',
    '/model.24/Add_2', '/model.24/Pow', '/model.24/Pow_1',
    '/model.24/Pow_2',
],
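For context, the exclusion above is applied roughly like this. This is only a sketch using ONNX Runtime's `quantize_static`; the model paths, `YoloCalibrationReader`, and `calibration_images` are placeholders, not part of my actual setup:

```python
import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class YoloCalibrationReader(CalibrationDataReader):
    """Hypothetical reader feeding preprocessed images to the calibrator."""
    def __init__(self, images):
        # YOLOv5 ONNX exports name their input tensor "images" (NCHW float32).
        self.data = iter({"images": img[np.newaxis, ...]} for img in images)

    def get_next(self):
        return next(self.data, None)

# Stand-in calibration data; use real preprocessed images in practice.
calibration_images = np.random.rand(100, 3, 640, 640).astype(np.float32)

quantize_static(
    "yolov5s.onnx",       # float32 input model (path is an assumption)
    "yolov5s-int8.onnx",  # quantized output model
    YoloCalibrationReader(calibration_images),
    weight_type=QuantType.QInt8,
    nodes_to_exclude=[
        '/model.24/Concat_3', '/model.24/Reshape_1',
        # ... plus the remaining /model.24 nodes from the list above
    ],
)
```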
Therefore, my first question is: is there anything I can do to fully quantize the model without excluding those nodes?
In the meantime I am also quantizing with TFLite full-int8 quantization; here is the code I am using:
```python
import tensorflow as tf

saved_model_dir = 'path2_saved_model'
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()
```
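(The `representative_dataset` above is not shown in the snippet; a minimal sketch of one, where the input shape, scaling, and `calibration_images` array are assumptions, might look like this:)

```python
import numpy as np

# Hypothetical calibration set: preprocessed images, NHWC float32 in [0, 1],
# matching the exported SavedModel's input shape.
calibration_images = np.random.rand(100, 640, 640, 3).astype(np.float32)

def representative_dataset():
    # Yield one sample at a time, wrapped in a list as the converter expects.
    for img in calibration_images:
        yield [img[np.newaxis, ...]]
```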
However, when I try to run detect.py on the TFLite model, I get the following error:
It seems to expect the input image to be int8; how can I solve this issue?
Thank you in advance!!
Environment
No response
Minimal Reproducible Example
No response
Additional
No response
Are you willing to submit a PR?
- [ ] Yes I'd like to help by submitting a PR!
@katia-katkat hello!
Quantization can indeed be a bit tricky, especially when it comes to preserving the accuracy of your model. For ONNX, fully quantizing the model without excluding nodes might require custom quantization approaches or fine-tuning the quantized model to regain accuracy. Unfortunately, there isn't a one-size-fits-all solution, and it often involves a bit of trial and error.
Regarding the TFLite issue, the error you're encountering suggests that the input data type expected by the model doesn't match the data type of the input you're providing. If your model expects `int8` inputs, you'll need to ensure that your input data is properly quantized to `int8` before passing it to the model. This typically involves normalizing the image data and then scaling it to the `int8` range.
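For illustration, here's a minimal sketch of that quantize/dequantize step using the scale and zero point stored in the interpreter's tensor details (the model path, input shape, and dummy image are placeholders, not from your setup):

```python
import numpy as np
import tensorflow as tf

# Load the fully int8-quantized model (path is a placeholder).
interpreter = tf.lite.Interpreter(model_path="yolov5s-int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Stand-in for a letterboxed image, NHWC float32 normalized to [0, 1].
img = np.random.rand(1, 640, 640, 3).astype(np.float32)

# Quantize the float input: real_value = scale * (int_value - zero_point).
scale, zero_point = inp["quantization"]
img_q = (img / scale + zero_point).round().astype(inp["dtype"])

interpreter.set_tensor(inp["index"], img_q)
interpreter.invoke()

# Dequantize the output back to float before the usual postprocessing.
y = interpreter.get_tensor(out["index"]).astype(np.float32)
scale, zero_point = out["quantization"]
y = (y - zero_point) * scale
```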
For more detailed guidance on quantization and handling data types, please refer to our documentation at https://docs.ultralytics.com/yolov5/. If you continue to face issues, consider opening a discussion in the repo so that the community can provide more targeted advice.
Best of luck with your quantization efforts! 🚀
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐