Quantization using ONNX / TFLite
Search before asking
- [X] I have searched the YOLOv5 issues and found no similar bug report.
YOLOv5 Component
No response
Bug
Hello, I previously opened an issue regarding PyTorch quantization, and it appears that a prerequisite for using PyTorch in this context is an in-depth understanding of the model's architecture. Consequently, I switched to other quantization methods.
The initial approach uses ONNX Runtime. Quantization works, but only when I exclude the nodes of the last (Detect) layer:

```python
nodes_to_exclude=[
    '/model.24/Concat_3', '/model.24/Reshape_1', '/model.24/Reshape_3',
    '/model.24/Reshape_5', '/model.24/Concat_2', '/model.24/Concat_1',
    '/model.24/Concat', '/model.24/Mul', '/model.24/Mul_1',
    '/model.24/Mul_2', '/model.24/Mul_3', '/model.24/Mul_4',
    '/model.24/Mul_5', '/model.24/Mul_6', '/model.24/Mul_7',
    '/model.24/Mul_8', '/model.24/Mul_9', '/model.24/Mul_10',
    '/model.24/Mul_11', '/model.24/Add', '/model.24/Add_1',
    '/model.24/Add_2', '/model.24/Pow', '/model.24/Pow_1',
    '/model.24/Pow_2',
],
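For context, the exclusion above is applied roughly like this. This is only a sketch using ONNX Runtime's `quantize_static`; the model paths, `YoloCalibrationReader`, and `calibration_images` are placeholders, not part of my actual setup:

```python
import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class YoloCalibrationReader(CalibrationDataReader):
    """Hypothetical reader feeding preprocessed images to the calibrator."""
    def __init__(self, images):
        # YOLOv5 ONNX exports name their input tensor "images" (NCHW float32).
        self.data = iter({"images": img[np.newaxis, ...]} for img in images)

    def get_next(self):
        return next(self.data, None)

# Stand-in calibration data; use real preprocessed images in practice.
calibration_images = np.random.rand(100, 3, 640, 640).astype(np.float32)

quantize_static(
    "yolov5s.onnx",       # float32 input model (path is an assumption)
    "yolov5s-int8.onnx",  # quantized output model
    YoloCalibrationReader(calibration_images),
    weight_type=QuantType.QInt8,
    nodes_to_exclude=[
        '/model.24/Concat_3', '/model.24/Reshape_1',
        # ... plus the remaining /model.24 nodes from the list above
    ],
)
```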
Therefore, my first question is: is there anything I can do to fully quantize the model without excluding those nodes?
In the meantime I am also quantizing with TFLite full-int8 quantization; here is the code I am using:
```python
import tensorflow as tf

saved_model_dir = 'path2_saved_model'
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()
```
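(The `representative_dataset` above is not shown in the snippet; a minimal sketch of one, where the input shape, scaling, and `calibration_images` array are assumptions, might look like this:)

```python
import numpy as np

# Hypothetical calibration set: preprocessed images, NHWC float32 in [0, 1],
# matching the exported SavedModel's input shape.
calibration_images = np.random.rand(100, 640, 640, 3).astype(np.float32)

def representative_dataset():
    # Yield one sample at a time, wrapped in a list as the converter expects.
    for img in calibration_images:
        yield [img[np.newaxis, ...]]
```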
However, when I try to run detect.py on the TFLite model, I get the following error:
It seems to expect the input image to be int8; how can I solve this issue?
Thank you in advance!!
Environment
No response
Minimal Reproducible Example
No response
Additional
No response
Are you willing to submit a PR?
- [ ] Yes I'd like to help by submitting a PR!
@katia-katkat hello!
Quantization can indeed be a bit tricky, especially when it comes to preserving the accuracy of your model. For ONNX, fully quantizing the model without excluding nodes might require custom quantization approaches or fine-tuning the quantized model to regain accuracy. Unfortunately, there isn't a one-size-fits-all solution, and it often involves a bit of trial and error.
Regarding the TFLite issue, the error you're encountering suggests that the input data type expected by the model doesn't match the data type of the input you're providing. If your model expects `int8` inputs, you'll need to ensure that your input data is properly quantized to `int8` before passing it to the model. This typically involves normalizing the image data and then scaling it to the `int8` range.
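For illustration, here's a minimal sketch of that quantize/dequantize step using the scale and zero point stored in the interpreter's tensor details (the model path, input shape, and dummy image are placeholders, not from your setup):

```python
import numpy as np
import tensorflow as tf

# Load the fully int8-quantized model (path is a placeholder).
interpreter = tf.lite.Interpreter(model_path="yolov5s-int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Stand-in for a letterboxed image, NHWC float32 normalized to [0, 1].
img = np.random.rand(1, 640, 640, 3).astype(np.float32)

# Quantize the float input: real_value = scale * (int_value - zero_point).
scale, zero_point = inp["quantization"]
img_q = (img / scale + zero_point).round().astype(inp["dtype"])

interpreter.set_tensor(inp["index"], img_q)
interpreter.invoke()

# Dequantize the output back to float before the usual postprocessing.
y = interpreter.get_tensor(out["index"]).astype(np.float32)
scale, zero_point = out["quantization"]
y = (y - zero_point) * scale
```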
For more detailed guidance on quantization and handling data types, please refer to our documentation at https://docs.ultralytics.com/yolov5/. If you continue to face issues, consider opening a discussion in the repo so that the community can provide more targeted advice.
Best of luck with your quantization efforts! 🚀
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐