Exported encodings do not work with QNN when using aimet_torch QAT
Hi, this is great work!
When I use aimet_torch to export my encodings after quantization-aware training (QAT), I get the following encodings:
{
    "activation_encodings": [balabala],
    "excluded_layers": [],
    "param_encodings": [
        {
            "bw": 8,
            "dtype": "INT",
            "enc_type": "PER_CHANNEL",
            "is_sym": true,
            "name": "backbone.stem.0.conv.weight",
            "offset": [
                -128, -128, -128, -128, -128, -128, -128, -128,
                -128, -128, -128, -128, -128, -128, -128, -128
            ],
            "scale": [
                0.0003764387220144272,
                0.00042047951137647033,
                1.1920928955078125e-07,
                0.0005325825186446309,
                0.0004614265344571322,
                0.0004825303622055799,
                0.0004339261504355818,
                0.0005174417165108025,
                0.0006007375195622444,
                0.0004937442135997117,
                0.0003481963649392128,
                1.1920928955078125e-07,
                1.1920928955078125e-07,
                0.000488974794279784,
                1.1920928955078125e-07,
                0.00041557062650099397
            ]
        },
        .......
    ],
    "quantizer_args": {
        "activation_bitwidth": 8,
        "dtype": "int",
        "is_symmetric": true,
        "param_bitwidth": 8,
        "per_channel_quantization": true,
        "quant_scheme": "post_training_tf_enhanced"
    },
    "version": "1.0.0"
}
However, when I run qnn-onnx-converter on the exported model and encodings, it fails with:
Traceback (most recent call last):
  File "/opt/qcom/aistack/qairt/2.33.0.250327/bin/x86_64-linux-clang/qnn-onnx-converter", line 78, in main
    backend.save(optimized_graph)
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/backend/ir_to_qnn.py", line 570, in save
    ir_graph = self.get_ir_graph(graph)
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/qnn_backend/qnn_backend_base.py", line 584, in get_ir_graph
    raise e
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/qnn_backend/qnn_backend_base.py", line 572, in get_ir_graph
    QnnTranslations.apply_method_to_all_ops(BackendTranslationBase.ADD_OP_TO_BACKEND, graph, self)
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/common/converter_ir/translation.py", line 71, in apply_method_to_all_ops
    self.apply_method_to_op(node.op.type, method_name, node, graph, *args, **kwargs)
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/common/converter_ir/translation.py", line 51, in apply_method_to_op
    return translation.apply_method(method_name, *args, **kwargs)
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/common/converter_ir/translation.py", line 18, in apply_method
    return self.indexed_methods[method_name](*args, **kwargs)
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/qnn_backend/qnn_translations.py", line 747, in add_op_to_backend
    backend.add_node(node.op.name, conv_type,
  File "/opt/qcom/aistack/qairt/2.33.0.250327/lib/python/qti/aisw/converters/backend/ir_to_qnn.py", line 286, in add_node
    if not self.model.add_node(node_name, node_type, node_package_name, tensor_params, scalar_params,
RuntimeError: fillQuantInfoForPerAxis: The axis info are needed in per_channel fakequant, but missed it!
When we change aimet_common.quantsim.encoding_version to '2.0.0', the export fails:
Traceback (most recent call last):
  File "/home/xxxxxxx/qat_with_mmcv_mmdet/./qat_export.py", line 310, in export_model
    sim.export(
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/aimet_torch/v2/quantsim/quantsim.py", line 549, in export
    return super().export(
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/aimet_torch/_base/quantsim.py", line 948, in export
    self.export_onnx_model_and_encodings(
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/aimet_torch/_base/quantsim.py", line 1090, in export_onnx_model_and_encodings
    cls._export_encodings_to_files(
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/aimet_torch/_base/quantsim.py", line 1421, in _export_encodings_to_files
    _export_to_1_0_0(
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/aimet_torch/experimental/v2/quantsim/export_utils.py", line 72, in _export_to_1_0_0
    activation_encodings = _get_activation_encodings(
  File "/home/xxxxxxx/aimet_mm_v2/lib/python3.10/site-packages/aimet_torch/experimental/v2/quantsim/export_utils.py", line 110, in _get_activation_encodings
    assert encodings[0]["dtype"] in {"int", "float"}
KeyError: 0
Only when we change aimet_common.quantsim.encoding_version to '0.6.1' does it work well.
So which encoding version should be used together with QNN? And is there any performance difference between the different export versions?
Hi @charmway, what aimet-torch version are you using? We recently fixed a similar issue, and this could be related to an older QAIRT version. Could you please try with the latest aimet-torch and QAIRT 2.41?
Hi. I am using aimet-torch 2.18.0, and my QNN version is 2.33.0.250327.
I will try QAIRT 2.41 later.
I still wonder why the export fails when I set aimet_common.quantsim.encoding_version to '2.0.0' (see the second traceback above).
@quic-bhushans Hi, DiDi
@charmway Thanks for the great questions 😊
I don't have much of a clue about the first QAIRT error, but I have two general recommendations:
- Try qairt-converter instead of qnn-onnx-converter. qairt-converter is the new, all-in-one converter and is the one best maintained these days.
- Upgrading your QAIRT version can also help, especially if you are going to try qairt-converter.
About the second issue, sorry for the confusion; the error message hasn't been very helpful there. Again, two points:
- sim.export doesn't support the 2.0.0 encoding format; you should use sim.onnx.export instead (see the sketch after these two points). In the latest main branch, I added a clearer error message about this: https://github.com/quic/aimet/blob/a048d5ff67ee059d77112e9c9bab54b910ee37d4/TrainingExtensions/torch/src/python/aimet_torch/v2/quantsim/quantsim.py#L494-L502
- If you want to try the 2.0.0 encoding format, you MUST upgrade your QAIRT version to the latest. (Using the latest AIMET isn't strictly necessary, but it is recommended.) Exporting encodings in the 2.0.0 format is a beta feature that is still under active development; many improvements and bugfixes were added over the last few releases, and without them qairt-converter is most certainly not going to work.
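For illustration, here is a rough sketch of that export path (I'm assuming sim.onnx.export follows the torch.onnx.export-style (args, f, ...) calling convention; sim and dummy_input are the QuantizationSimModel and dummy input from your script, and the output path is a placeholder, so please double-check the exact signature against the latest docs):

```python
# Rough sketch (assumed calling convention, placeholder path).
# sim.onnx.export writes the ONNX model and, alongside it, the matching
# .encodings file in the 2.0.0 format.
sim.onnx.export(dummy_input, "./export/model.onnx")
```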
Thanks 💯