
uint8 quantization

Open zhangchen98 opened this issue 1 year ago • 2 comments

Issue Type

Others

Source

pip (model-compression-toolkit)

MCT Version

1.8.0

OS Platform and Distribution

Linux version 3.10.0-327.36.3.el7.x86_64 ([email protected])

Python version

3.7

Describe the issue

My model is trained with PyTorch. Suppose I want to use MCT's PTQ quantization method to quantize it to 8 bits and then deploy the model on an edge device. How should I do this?

Thanks!

Expected behaviour

No response

Code to reproduce the issue

None

Log output

No response

zhangchen98 avatar Apr 20 '23 13:04 zhangchen98

Hello @1437539743, For now, MCT exports quantized models in a fakely-quantized manner: the weights are quantized but stored with a float32 data type, and the activations are quantized using fake-quantization operations. However, we do support the int8 data type in TFLite models; a usage example can be seen here. As for PyTorch models, the uint8 data type may be supported in future releases. In the meantime, you can access the quantization information (number of bits, thresholds, etc.) attached to each layer by setting the flag new_experimental_exporter when calling pytorch_post_training_quantization_experimental.
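To illustrate what "fakely-quantized" means in the reply above, here is a minimal sketch (not MCT's actual implementation): each weight is snapped to a symmetric 8-bit uniform grid defined by a per-layer threshold, but the stored values remain floats, analogous to float32 weights in the exported model. The function name and the threshold value are hypothetical.

```python
def fake_quantize(values, threshold, n_bits=8):
    """Quantize-dequantize each value onto a symmetric n_bits grid in
    [-threshold, threshold), returning floats rather than integers."""
    n_levels = 2 ** (n_bits - 1)                  # 128 positive levels for 8 bits
    scale = threshold / n_levels                  # step size of the uniform grid
    fakely_quantized = []
    for v in values:
        q = round(v / scale)                      # integer grid index
        q = max(-n_levels, min(n_levels - 1, q))  # clip to the int8 range
        fakely_quantized.append(q * scale)        # back to float: "fake" quantized
    return fakely_quantized

# Example: a few weights with a (hypothetical) layer threshold of 2.0.
# Values outside the threshold are clipped; all outputs stay floats.
print(fake_quantize([0.3, -1.7, 0.05, 2.5], threshold=2.0))
```

Note that 2.5 lies above the threshold, so it is clipped to the largest representable grid point rather than quantized exactly; choosing the per-layer threshold is precisely the kind of information MCT attaches to each layer.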

reuvenperetz avatar Apr 27 '23 07:04 reuvenperetz

Stale issue message

github-actions[bot] avatar Jul 10 '23 10:07 github-actions[bot]