model_optimization
uint8 quantization
Issue Type
Others
Source
pip (model-compression-toolkit)
MCT Version
1.8.0
OS Platform and Distribution
Linux version 3.10.0-327.36.3.el7.x86_64 ([email protected])
Python version
3.7
Describe the issue
My model is trained with PyTorch. Suppose I want to use MCT's PTQ method to quantize it to 8 bits and deploy the model on an edge device; how should I do this?
Thanks!
Expected behaviour
No response
Code to reproduce the issue
None
Log output
No response
Hello @1437539743, For now, MCT exports quantized models in a fakely-quantized manner (namely, the weights are quantized but have a float32 data type, and the activations are quantized using fake-quantization operations). However, we support the int8 data type in TFLite models; a usage example can be seen here. As for PyTorch models, the uint8 data type may be supported in future releases. In the meantime, you can access the quantization information (number of bits, thresholds, etc.) attached to each layer by setting the flag new_experimental_exporter when calling pytorch_post_training_quantization_experimental.
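As a concrete starting point, here is a minimal sketch of that flow. It assumes the MCT 1.8.0 top-level entry point pytorch_post_training_quantization_experimental and the new_experimental_exporter flag mentioned above; the model (torchvision's mobilenet_v2), the input shape, and the random calibration data are placeholders, so swap in your own trained model and real representative samples, and check your installed version's docs for the exact signature.

```python
import torch
import model_compression_toolkit as mct
from torchvision.models import mobilenet_v2  # placeholder for your trained model

# Float model trained in PyTorch (placeholder).
model = mobilenet_v2(pretrained=True)

# Representative dataset: a callable MCT invokes repeatedly during calibration.
# Each call should return one batch per model input; random data is used here
# only for illustration -- use real samples from your dataset in practice.
def representative_data_gen():
    return [torch.randn(1, 3, 224, 224)]

# Run post-training quantization. With new_experimental_exporter=True, the
# returned fakely-quantized model carries per-layer quantization information
# (bit widths, thresholds, etc.) that you can read off for deployment.
quantized_model, quantization_info = mct.pytorch_post_training_quantization_experimental(
    model,
    representative_data_gen,
    new_experimental_exporter=True,
)
```

The returned model's weights remain float32 tensors holding quantized values, so for an actual uint8 edge deployment you would translate the attached per-layer parameters into your target runtime's format yourself until native PyTorch integer export lands.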