peterjc123
@Raychen0617 So my question is: is the quantization-aware training performing well? If it is and the difference is still too large, would you please provide your model (the...
> QAT can run without error

I mean, what about the accuracy of the model? Does it drop a lot compared to the floating-point one?
The way you do validation is not right, I guess. It is pointless to run the model with the fake-quantized weights that are used during quantization-aware training.

> However, when using Tinynn...
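For reference, a minimal sketch of the usual flow, following the calls shown in the TinyNeuralNetwork README (the toy model, input, and training loop are placeholders): the model you validate should be the converted, truly quantized one, not the QAT model itself.

```py
import torch
import torch.nn as nn
from tinynn.graph.quantization.quantizer import QATQuantizer

# Placeholder model and input for illustration only.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
dummy_input = torch.rand(1, 3, 32, 32)

quantizer = QATQuantizer(model, dummy_input, work_dir='out')
qat_model = quantizer.quantize()

# ... run quantization-aware training on `qat_model` here ...

# Convert to an actually quantized model before validating;
# evaluating the fake-quantized QAT model is not what ships.
with torch.no_grad():
    qat_model.eval()
    qat_model.cpu()
    quantized_model = quantizer.convert(qat_model)
```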
PRs & Commits:
- TFLite schema updates https://github.com/alibaba/TinyNeuralNetwork/pull/57
- aten::cumsum https://github.com/alibaba/TinyNeuralNetwork/commit/0e94eccfa5061d27a33ed2fcbee1b111f8f7fd34
- conv3d, conv_transpose3d https://github.com/alibaba/TinyNeuralNetwork/commit/468801a4179506109bebbceb67e7b13af11020bb
The following pattern in your model is the root cause of the problem.

```
A = sigmoid(X)
B = cat(A, Y)
```

The output tensor of the `sigmoid` op has...
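For context, a minimal sketch of that pattern as a plain PyTorch module (names are illustrative). In eager-mode PyTorch quantization, `sigmoid` outputs are assigned fixed quantization parameters, while `cat` generally wants all of its inputs on a shared scale, which is why this combination is hard to quantize:

```py
import torch
import torch.nn as nn

# Minimal reproduction of the problematic pattern (illustrative names).
class SigmoidCat(nn.Module):
    def forward(self, x, y):
        a = torch.sigmoid(x)             # sigmoid output gets fixed qparams
        return torch.cat([a, y], dim=1)  # cat wants inputs on a shared scale
```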
Or you may just skip the quantization for this kind of pattern, which seems to be the simplest solution.
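In plain PyTorch eager-mode quantization (which TinyNeuralNetwork builds on), one way to skip a pattern is to clear the `qconfig` of the submodule that contains it before preparing the model; a sketch under the assumption that the pattern lives in a hypothetical submodule named `head`:

```py
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig, prepare_qat

# Hypothetical model whose `head` contains the sigmoid + cat pattern.
class Head(nn.Module):
    def forward(self, x):
        return torch.cat([torch.sigmoid(x), x], dim=1)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.head = Head()

    def forward(self, x):
        return self.head(self.conv(x))

model = Net().train()
model.qconfig = get_default_qat_qconfig('qnnpack')
model.head.qconfig = None  # leave the problematic pattern in floating point
qat_model = prepare_qat(model)
```

Whether TinyNN exposes its own per-layer switch for this is not covered here; the snippet shows only the generic PyTorch mechanism.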
@liamsun2019 As we all know, quantization is not lossless. I think it's pointless to perform such a comparison. There will certainly be some differences between the results of the...
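If a comparison is wanted anyway, a similarity metric is more informative than checking for exact equality; a minimal sketch, assuming both models take the same input and return dequantized float tensors:

```py
import torch
import torch.nn.functional as F

def compare_outputs(float_model, quant_model, x):
    """Compare the two models by similarity rather than exact equality."""
    with torch.no_grad():
        ref = float_model(x).flatten()
        out = quant_model(x).flatten()
    cos = F.cosine_similarity(ref, out, dim=0)
    mse = F.mse_loss(out, ref)
    print(f'cosine similarity: {cos.item():.4f}, MSE: {mse.item():.6f}')
```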
@liamsun2019 Are you sure you use the following config for the quantizer?

```py
quantizer = QATQuantizer(model, dummy_input, config={'asymmetric': True, 'per_tensor': False, ...})
```
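One way to double-check the effective scheme is to inspect the fake-quantize modules in the prepared model; a sketch, assuming `qat_model` is the QAT model produced by the quantizer and that it uses PyTorch's standard `FakeQuantize` modules:

```py
import torch
from torch.ao.quantization import FakeQuantize

# Print the scheme of every fake-quantize module so you can confirm that
# weights are per-channel and activations are asymmetric as configured.
for name, module in qat_model.named_modules():
    if isinstance(module, FakeQuantize):
        print(name, module.qscheme, module.dtype,
              module.quant_min, module.quant_max)
```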
@liamsun2019 One thing in your model is weird. I've actually set quant_min and quant_max to -127 and 127, but you can still see -128 in the weights.
Looks like we can't set quant_min and quant_max this way; the observer has its own logic for recalculating them. https://github.com/pytorch/pytorch/blob/4a8d4cde6589178e989db89d576108ba6d3e6e9a/torch/ao/quantization/utils.py#L192
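A quick way to see this recalculation in action is to hand an observer a custom range and inspect what it actually keeps; a sketch, assuming a PyTorch version whose observers accept `quant_min`/`quant_max` arguments:

```py
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Request a symmetric qint8 range of [-127, 127]...
obs = MinMaxObserver(dtype=torch.qint8,
                     qscheme=torch.per_tensor_symmetric,
                     quant_min=-127, quant_max=127)
obs(torch.randn(64))

# ...then check what the observer actually uses: depending on the
# PyTorch version, the range may be recomputed internally, so the
# printed values can differ from the requested [-127, 127].
print(obs.quant_min, obs.quant_max)
print(obs.calculate_qparams())
```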