Reenable scale unification
Changes
Restored the original, pre-#1778 logic of scale unification and added the missing tests for it. Added BatchNorm
as a quantizable operation (asymmetric, per-channel) to the CPU HW config to handle cases such as DenseNet, where a batch norm is the first operation in a branch.
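For reference, a rough sketch of the kind of entry this adds, written here as a Python dict; the actual CPU HW config is a JSON file, and the exact keys and the alias name below are assumptions for illustration rather than the real schema.

```python
# Illustrative only: the real entry lives in the CPU HW config JSON file;
# the alias "q8_a_ch" and exact key spellings are assumptions.
q8_asym_per_channel = {
    "bits": 8,
    "mode": "asymmetric",
    "granularity": "perchannel",
}

# BatchNorm declared as a quantizable operation whose activations use the
# asymmetric, per-channel 8-bit quantizer configuration defined above.
batch_norm_op = {
    "type": "BatchNormalization",
    "quantization": {"activations": "q8_a_ch"},
}
```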
Reason for changes
Scales are currently not correctly unified in cases such as #2195.
Related tickets
N/A
Tests
tests/common/quantization/test_quantizer_setup.py
tests/**/quantization/test_unified_scales.py
Fixes: #2195
I put this out of the draft state to trigger the CI runs first.
Codecov Report
Attention: 3 lines in your changes are missing coverage. Please review.
Comparison is base (6539272) 90.82% compared to head (2ef4513) 84.56%. Report is 6 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #2199 +/- ##
===========================================
- Coverage 90.82% 84.56% -6.27%
===========================================
Files 498 498
Lines 45485 45482 -3
===========================================
- Hits 41314 38464 -2850
- Misses 4171 7018 +2847
Files | Coverage Δ | |
---|---|---|
nncf/common/hardware/opset.py | 100.00% <100.00%> (ø) | |
...ommon/quantization/quantizer_propagation/solver.py | 93.89% <ø> (+0.12%) | :arrow_up: |
nncf/onnx/graph/metatypes/onnx_metatypes.py | 99.58% <100.00%> (+<0.01%) | :arrow_up: |
...ncf/openvino/graph/metatypes/openvino_metatypes.py | 90.71% <100.00%> (-8.72%) | :arrow_down: |
nncf/quantization/algorithms/min_max/algorithm.py | 94.84% <ø> (-2.35%) | :arrow_down: |
nncf/quantization/algorithms/min_max/backend.py | 100.00% <ø> (ø) | |
...cf/quantization/algorithms/min_max/onnx_backend.py | 94.87% <ø> (-0.10%) | :arrow_down: |
...uantization/algorithms/min_max/openvino_backend.py | 0.00% <ø> (-96.43%) | :arrow_down: |
...f/quantization/algorithms/min_max/torch_backend.py | 97.38% <ø> (-0.05%) | :arrow_down: |
nncf/tensorflow/graph/metatypes/keras_layers.py | 96.74% <100.00%> (+<0.01%) | :arrow_up: |
... and 5 more | | |

... and 54 files with indirect coverage changes
Flag | Coverage Δ | |
---|---|---|
COMMON | 43.26% <80.95%> (+1.03%) | :arrow_up: |
ONNX | 34.72% <56.00%> (-0.02%) | :arrow_down: |
OPENVINO | ∅ <ø> (∅) | |
TENSORFLOW | 29.67% <56.00%> (-0.01%) | :arrow_down: |
TORCH | 65.96% <68.00%> (+<0.01%) | :arrow_up: |

Flags with carried forward coverage won't be shown.
Components | Coverage Δ | |
---|---|---|
common | 92.74% <85.71%> (-0.59%) | :arrow_down: |
torch | 93.41% <100.00%> (+<0.01%) | :arrow_up: |
tensorflow | 94.05% <100.00%> (+0.08%) | :arrow_up: |
onnx | 93.04% <100.00%> (-0.01%) | :arrow_down: |
openvino | 25.65% <100.00%> (-68.42%) | :arrow_down: |
ptq | 67.18% <ø> (-20.48%) | :arrow_down: |
ONNX E2E build 489 shows an accuracy improvement for densenet12 (+0.77%) and resnet50-v2-7 (+0.1%), and no difference for the other models.
TF E2E build 490 shows no significant changes compared to the regular nightly run.
PTQ build 240 shows an accuracy degradation of 0.1% on timm/dpn68 and of 0.2% on timm/visformer_small; both occur for the Torch backend only, and in both cases the INT8 metric is still within 99% of FP32.
@KodiaqQ
General questions:
- How does scale unification work in this PR? Based on the code, I assume that all possible layers are now unified. Is that correct?
- Why were most Tensorflow models updated in tests? What about other backends?
- Were any conformance tests run on the final version? Are there any degradations?
- See the tests. The activation scales are unified based on the unified-scales marker in the HW config and do not take metatypes into account (a conceptual sketch follows after this list).
- I think your original PR with changes in the same place also led to TF model updates, so there should be no surprise here.
- See the comment above ("PTQ build 240").
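To make the first answer concrete, here is a minimal, self-contained sketch of what unified activation scales mean in practice. This is not the NNCF API; the helper below is purely illustrative: all quantizers feeding an operation marked with unified scales in the HW config end up sharing one set of quantization parameters, computed from the statistics of every input branch.

```python
import numpy as np

# Illustrative helper (not NNCF code): compute one shared scale for all
# branches entering an op that the HW config marks as requiring unified
# scales (e.g. a concat or eltwise add), instead of one scale per branch.
def unified_scale(branch_outputs, num_bits=8):
    global_max = max(float(np.abs(x).max()) for x in branch_outputs)
    return global_max / (2 ** (num_bits - 1) - 1)  # symmetric, per-tensor

# Two branches with very different dynamic ranges still get the same scale,
# so the downstream integer op sees consistently quantized inputs.
branch_a = np.random.randn(1, 16, 8, 8) * 3.0
branch_b = np.random.randn(1, 16, 8, 8) * 0.5
print(f"shared scale: {unified_scale([branch_a, branch_b]):.4f}")
```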
Thanks! Let's run the validation again (with the latest changes in NNCF & OV) and verify the results. Then merge.
@KodiaqQ PTQ build 285 shows a single regression vs build 286, on timm/visformer_small. The overall metric is still within 99% of FP32.
If the PTQ build is red, then we need to update the reference for the timm/visformer_small model as well. We can do it in this PR or in a follow-up, it doesn't matter, but we should do it to keep the builds green. Also, is there any insight into why this model shows lower accuracy than the OV and ONNX versions?