Reenable scale unification
Changes
Restored the original, pre-#1778 logic of scale unification and added the missing tests for it. Added BatchNorm
as a quantizable operation (asymmetric, per-channel) to the CPU HW config to handle cases such as DenseNet, where a batch norm is the first operation in a branch.
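For reference, a rough sketch of the kind of entry this adds, written here as a Python dict; the actual CPU HW config is a JSON file, and the exact keys and the alias name below are assumptions for illustration rather than the real schema.

```python
# Illustrative only: the real entry lives in the CPU HW config JSON file;
# the alias "q8_a_ch" and exact key spellings are assumptions.
q8_asym_per_channel = {
    "bits": 8,
    "mode": "asymmetric",
    "granularity": "perchannel",
}

# BatchNorm declared as a quantizable operation whose activations use the
# asymmetric, per-channel 8-bit quantizer configuration defined above.
batch_norm_op = {
    "type": "BatchNormalization",
    "quantization": {"activations": "q8_a_ch"},
}
```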
Reason for changes
Scales are currently not correctly unified in cases such as #2195.
Related tickets
N/A
Tests
tests/common/quantization/test_quantizer_setup.py
tests/**/quantization/test_unified_scales.py
Fixes: #2195
I put this out of the draft state to trigger the CI runs first.
Codecov Report
Attention: 3 lines in your changes are missing coverage. Please review.
Comparison is base (6539272) 90.82% compared to head (2ef4513) 84.56%. Report is 6 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #2199 +/- ##
===========================================
- Coverage 90.82% 84.56% -6.27%
===========================================
Files 498 498
Lines 45485 45482 -3
===========================================
- Hits 41314 38464 -2850
- Misses 4171 7018 +2847
Files | Coverage Δ | |
---|---|---|
nncf/common/hardware/opset.py | 100.00% <100.00%> (ø) | |
...ommon/quantization/quantizer_propagation/solver.py | 93.89% <ø> (+0.12%) | :arrow_up: |
nncf/onnx/graph/metatypes/onnx_metatypes.py | 99.58% <100.00%> (+<0.01%) | :arrow_up: |
...ncf/openvino/graph/metatypes/openvino_metatypes.py | 90.71% <100.00%> (-8.72%) | :arrow_down: |
nncf/quantization/algorithms/min_max/algorithm.py | 94.84% <ø> (-2.35%) | :arrow_down: |
nncf/quantization/algorithms/min_max/backend.py | 100.00% <ø> (ø) | |
...cf/quantization/algorithms/min_max/onnx_backend.py | 94.87% <ø> (-0.10%) | :arrow_down: |
...uantization/algorithms/min_max/openvino_backend.py | 0.00% <ø> (-96.43%) | :arrow_down: |
...f/quantization/algorithms/min_max/torch_backend.py | 97.38% <ø> (-0.05%) | :arrow_down: |
nncf/tensorflow/graph/metatypes/keras_layers.py | 96.74% <100.00%> (+<0.01%) | :arrow_up: |
... and 5 more | | |

... and 54 files with indirect coverage changes
Flag | Coverage Δ | |
---|---|---|
COMMON | 43.26% <80.95%> (+1.03%) | :arrow_up: |
ONNX | 34.72% <56.00%> (-0.02%) | :arrow_down: |
OPENVINO | ∅ <ø> (∅) | |
TENSORFLOW | 29.67% <56.00%> (-0.01%) | :arrow_down: |
TORCH | 65.96% <68.00%> (+<0.01%) | :arrow_up: |

Flags with carried forward coverage won't be shown.
Components | Coverage Δ | |
---|---|---|
common | 92.74% <85.71%> (-0.59%) | :arrow_down: |
torch | 93.41% <100.00%> (+<0.01%) | :arrow_up: |
tensorflow | 94.05% <100.00%> (+0.08%) | :arrow_up: |
onnx | 93.04% <100.00%> (-0.01%) | :arrow_down: |
openvino | 25.65% <100.00%> (-68.42%) | :arrow_down: |
ptq | 67.18% <ø> (-20.48%) | :arrow_down: |
ONNX E2E build 489 shows an accuracy improvement for densenet12 (+0.77%) and resnet50-v2-7 (+0.1%), and no difference for the other models.
TF E2E build 490 shows no significant changes compared to the regular nightly run.
PTQ build 240 shows an accuracy degradation of 0.1% on timm/dpn68 and of 0.2% on timm/visformer_small; both occur for the Torch backend only, and in both cases the INT8 metric is still within 99% of FP32.
@KodiaqQ
General questions:
- How does scale unification work in this PR? Based on the code, I assume that all possible layers are now unified. Is that correct?
- Why were most Tensorflow models updated in tests? What about other backends?
- Were any conformance tests run on the final version? Are there any degradations?
- See the tests. The activation scales are unified based on the unified-scales marker in the HW config and do not take metatypes into account (a conceptual sketch follows after this list).
- I think your original PR with changes in the same place also led to TF model updates, so there should be no surprise here.
- See the comment above ("PTQ build 240").
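To make the first answer concrete, here is a minimal, self-contained sketch of what unified activation scales mean in practice. This is not the NNCF API; the helper below is purely illustrative: all quantizers feeding an operation marked with unified scales in the HW config end up sharing one set of quantization parameters, computed from the statistics of every input branch.

```python
import numpy as np

# Illustrative helper (not NNCF code): compute one shared scale for all
# branches entering an op that the HW config marks as requiring unified
# scales (e.g. a concat or eltwise add), instead of one scale per branch.
def unified_scale(branch_outputs, num_bits=8):
    global_max = max(float(np.abs(x).max()) for x in branch_outputs)
    return global_max / (2 ** (num_bits - 1) - 1)  # symmetric, per-tensor

# Two branches with very different dynamic ranges still get the same scale,
# so the downstream integer op sees consistently quantized inputs.
branch_a = np.random.randn(1, 16, 8, 8) * 3.0
branch_b = np.random.randn(1, 16, 8, 8) * 0.5
print(f"shared scale: {unified_scale([branch_a, branch_b]):.4f}")
```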
Thanks! Let's run the validation again (with the latest changes in NNCF & OV) and verify the results. Then merge.
@KodiaqQ PTQ build 285 shows a single regression vs build 286, on timm/visformer_small. The overall metric is still within 99% of FP32.
If the PTQ build is red, then we need to update the reference for the timm/visformer_small model as well. We can do it in this PR or in a follow-up, it doesn't matter, but we should do it to keep the builds green. Also, is there any insight into why this model shows lower accuracy than the OV and ONNX versions?