peterjc123
@Raychen0617 So my question is: is the quantization-aware training performing well? If it is and the difference is still too large, would you please provide your model (the...
> QAT can run without error

I mean, what about the accuracy of the model? Does it drop a lot compared to the floating-point one?
The way you do validation is not right, I guess. It is pointless to run the model with the fake-quantized weights that are used during quantization-aware training.

> However, when using Tinynn...
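For reference, a minimal sketch of the usual flow, following the calls shown in the TinyNeuralNetwork README (the toy model, input, and training loop are placeholders): the model you validate should be the converted, truly quantized one, not the QAT model itself.

```py
import torch
import torch.nn as nn
from tinynn.graph.quantization.quantizer import QATQuantizer

# Placeholder model and input for illustration only.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
dummy_input = torch.rand(1, 3, 32, 32)

quantizer = QATQuantizer(model, dummy_input, work_dir='out')
qat_model = quantizer.quantize()

# ... run quantization-aware training on `qat_model` here ...

# Convert to an actually quantized model before validating;
# evaluating the fake-quantized QAT model is not what ships.
with torch.no_grad():
    qat_model.eval()
    qat_model.cpu()
    quantized_model = quantizer.convert(qat_model)
```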
PRs & Commits:
- TFLite schema updates https://github.com/alibaba/TinyNeuralNetwork/pull/57
- aten::cumsum https://github.com/alibaba/TinyNeuralNetwork/commit/0e94eccfa5061d27a33ed2fcbee1b111f8f7fd34
- conv3d, conv_transpose3d https://github.com/alibaba/TinyNeuralNetwork/commit/468801a4179506109bebbceb67e7b13af11020bb
The following pattern in your model is the root cause of the problem.

```
A = sigmoid(X)
B = cat(A, Y)
```

The output tensor of the `sigmoid` op has...
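For context, a minimal sketch of that pattern as a plain PyTorch module (names are illustrative). In eager-mode PyTorch quantization, `sigmoid` outputs are assigned fixed quantization parameters, while `cat` generally wants all of its inputs on a shared scale, which is why this combination is hard to quantize:

```py
import torch
import torch.nn as nn

# Minimal reproduction of the problematic pattern (illustrative names).
class SigmoidCat(nn.Module):
    def forward(self, x, y):
        a = torch.sigmoid(x)             # sigmoid output gets fixed qparams
        return torch.cat([a, y], dim=1)  # cat wants inputs on a shared scale
```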
Or you may just skip the quantization for this kind of pattern, which seems to be the simplest solution.
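In plain PyTorch eager-mode quantization (which TinyNeuralNetwork builds on), one way to skip a pattern is to clear the `qconfig` of the submodule that contains it before preparing the model; a sketch under the assumption that the pattern lives in a hypothetical submodule named `head`:

```py
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig, prepare_qat

# Hypothetical model whose `head` contains the sigmoid + cat pattern.
class Head(nn.Module):
    def forward(self, x):
        return torch.cat([torch.sigmoid(x), x], dim=1)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.head = Head()

    def forward(self, x):
        return self.head(self.conv(x))

model = Net().train()
model.qconfig = get_default_qat_qconfig('qnnpack')
model.head.qconfig = None  # leave the problematic pattern in floating point
qat_model = prepare_qat(model)
```

Whether TinyNN exposes its own per-layer switch for this is not covered here; the snippet shows only the generic PyTorch mechanism.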
@liamsun2019 As we all know, quantization is not lossless. I think it's pointless to perform such a comparison. There will certainly be some differences between the results of the...
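If a comparison is wanted anyway, a similarity metric is more informative than checking for exact equality; a minimal sketch, assuming both models take the same input and return dequantized float tensors:

```py
import torch
import torch.nn.functional as F

def compare_outputs(float_model, quant_model, x):
    """Compare the two models by similarity rather than exact equality."""
    with torch.no_grad():
        ref = float_model(x).flatten()
        out = quant_model(x).flatten()
    cos = F.cosine_similarity(ref, out, dim=0)
    mse = F.mse_loss(out, ref)
    print(f'cosine similarity: {cos.item():.4f}, MSE: {mse.item():.6f}')
```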
@liamsun2019 Are you sure you use the following config for the quantizer?

```py
quantizer = QATQuantizer(model, dummy_input, config={'asymmetric': True, 'per_tensor': False, ...})
```
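One way to double-check the effective scheme is to inspect the fake-quantize modules in the prepared model; a sketch, assuming `qat_model` is the QAT model produced by the quantizer and that it uses PyTorch's standard `FakeQuantize` modules:

```py
import torch
from torch.ao.quantization import FakeQuantize

# Print the scheme of every fake-quantize module so you can confirm that
# weights are per-channel and activations are asymmetric as configured.
for name, module in qat_model.named_modules():
    if isinstance(module, FakeQuantize):
        print(name, module.qscheme, module.dtype,
              module.quant_min, module.quant_max)
```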
@liamsun2019 One thing in your model is weird. I've actually set quant_min and quant_max to -127 and 127, but you can still see -128 in the weights.
Looks like we can't set quant_min and quant_max this way; the observer has its own logic for recalculating them. https://github.com/pytorch/pytorch/blob/4a8d4cde6589178e989db89d576108ba6d3e6e9a/torch/ao/quantization/utils.py#L192
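A quick way to see this recalculation in action is to hand an observer a custom range and inspect what it actually keeps; a sketch, assuming a PyTorch version whose observers accept `quant_min`/`quant_max` arguments:

```py
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Request a symmetric qint8 range of [-127, 127]...
obs = MinMaxObserver(dtype=torch.qint8,
                     qscheme=torch.per_tensor_symmetric,
                     quant_min=-127, quant_max=127)
obs(torch.randn(64))

# ...then check what the observer actually uses: depending on the
# PyTorch version, the range may be recomputed internally, so the
# printed values can differ from the requested [-127, 127].
print(obs.quant_min, obs.quant_max)
print(obs.calculate_qparams())
```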