[QNN][ONNX-Frontend] Fix error when reading zero_point during per-channel quantization
Quantization usually operates in one of two modes: per-channel or per-tensor. In per-channel mode, zero_point and scale are 1-D arrays whose length equals the number of tensor channels. In per-tensor mode, zero_point and scale are scalars. At present, the ONNX QNN frontend in TVM seems to consider only per-tensor quantization. With this change, zero_point can be read in the same way for both quantization modes without causing errors.
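
To illustrate the difference between the two modes, here is a minimal sketch in plain Python. It is not the TVM frontend code: `expand_qparam` and `dequantize` are hypothetical helper names introduced only for this example, and the tensor is assumed to be a nested list laid out as `[channels][elements]`. The sketch shows one way to accept either a scalar (per-tensor) or a 1-D sequence (per-channel) for zero_point and scale, then apply the usual affine formula `x = scale * (q - zero_point)`.

```python
from numbers import Number


def expand_qparam(param, num_channels):
    """Return one value per channel for a scalar or 1-D quantization parameter.

    Hypothetical helper for illustration; not part of TVM.
    """
    if isinstance(param, Number):
        # Per-tensor: a single value shared by every channel.
        return [param] * num_channels
    values = list(param)
    if len(values) == 1:
        # A length-1 array is treated like a per-tensor scalar.
        return values * num_channels
    if len(values) != num_channels:
        raise ValueError(
            f"expected {num_channels} per-channel values, got {len(values)}"
        )
    # Per-channel: one value for each channel.
    return values


def dequantize(q, scale, zero_point):
    """Dequantize a [channels][elements] nested list with x = scale * (q - zero_point)."""
    num_channels = len(q)
    scales = expand_qparam(scale, num_channels)
    zero_points = expand_qparam(zero_point, num_channels)
    return [
        [s * (v - z) for v in row]
        for row, s, z in zip(q, scales, zero_points)
    ]
```

Under these assumptions, `dequantize(q, 0.5, 10)` handles the per-tensor case and `dequantize(q, [0.5, 0.25], [10, 20])` handles the per-channel case through the same code path, which is the behavior the description above asks of the frontend.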