
Reenable scale unification

vshampor opened this pull request 1 year ago • 7 comments

Changes

Restored the original, pre-#1778 logic of scale unification and added the previously missing tests for that logic. Also added BatchNorm as a quantizable operation (asymmetric, per-channel) to the CPU HW config to handle cases such as densenet, where batch norm is the first operation in a branch (a sketch of such an entry follows below).
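
For illustration only, here is a minimal sketch of what such an entry might look like, written as a Python dict; the field names, the "BatchNormalization" type string, and the quantization-config layout are assumptions made for this sketch, not the actual contents of the CPU HW config file:

```python
# Hypothetical sketch of a BatchNorm entry in the CPU HW config
# (field names and layout are assumed for illustration):
batchnorm_entry = {
    "type": "BatchNormalization",
    "quantization": {
        "activations": {
            "bits": 8,
            "mode": "asymmetric",          # asymmetric, per the PR description
            "granularity": "perchannel",   # per-channel, per the PR description
        }
    },
}
```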

Reason for changes

Scales are currently not correctly unified in cases such as #2195.
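
For readers unfamiliar with the term: scale unification forces the activation quantizers on branches that meet at a multi-input operation (e.g. a concat) to share a single scale, since the merged tensor has one quantization range. A minimal PyTorch sketch of the affected pattern (the model is illustrative, not taken from the PR):

```python
import torch


class TwoBranchConcat(torch.nn.Module):
    """Two branches meet at a concat; after quantization, the fake-quantize
    nodes on both branch outputs should share one (scale, zero-point) pair
    rather than being calibrated independently."""

    def __init__(self) -> None:
        super().__init__()
        self.conv_a = torch.nn.Conv2d(3, 8, kernel_size=1)
        self.conv_b = torch.nn.Conv2d(3, 8, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.conv_a(x), self.conv_b(x)], dim=1)
```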

Related tickets

N/A

Tests

tests/common/quantization/test_quantizer_setup.py
tests/**/quantization/test_unified_scales.py

Fixes: #2195

vshampor avatar Oct 13 '23 15:10 vshampor

I put this out of the draft state to trigger the CI runs first.

vshampor avatar Oct 17 '23 15:10 vshampor

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (6539272) 90.82% compared to head (2ef4513) 84.56%. Report is 6 commits behind head on develop.

Additional details and impacted files


@@             Coverage Diff             @@
##           develop    #2199      +/-   ##
===========================================
- Coverage    90.82%   84.56%   -6.27%     
===========================================
  Files          498      498              
  Lines        45485    45482       -3     
===========================================
- Hits         41314    38464    -2850     
- Misses        4171     7018    +2847     
Files                                                  Coverage Δ
nncf/common/hardware/opset.py                          100.00% <100.00%> (ø)
...ommon/quantization/quantizer_propagation/solver.py   93.89% <ø> (+0.12%) ↑
nncf/onnx/graph/metatypes/onnx_metatypes.py             99.58% <100.00%> (+<0.01%) ↑
...ncf/openvino/graph/metatypes/openvino_metatypes.py   90.71% <100.00%> (-8.72%) ↓
nncf/quantization/algorithms/min_max/algorithm.py       94.84% <ø> (-2.35%) ↓
nncf/quantization/algorithms/min_max/backend.py        100.00% <ø> (ø)
...cf/quantization/algorithms/min_max/onnx_backend.py   94.87% <ø> (-0.10%) ↓
...uantization/algorithms/min_max/openvino_backend.py    0.00% <ø> (-96.43%) ↓
...f/quantization/algorithms/min_max/torch_backend.py   97.38% <ø> (-0.05%) ↓
nncf/tensorflow/graph/metatypes/keras_layers.py         96.74% <100.00%> (+<0.01%) ↑
... and 5 more

... and 54 files with indirect coverage changes

Flag        Coverage Δ
COMMON      43.26% <80.95%> (+1.03%) ↑
ONNX        34.72% <56.00%> (-0.02%) ↓
OPENVINO    ∅ <ø> (∅)
TENSORFLOW  29.67% <56.00%> (-0.01%) ↓
TORCH       65.96% <68.00%> (+<0.01%) ↑

Flags with carried forward coverage won't be shown.

Components  Coverage Δ
common      92.74% <85.71%> (-0.59%) ↓
torch       93.41% <100.00%> (+<0.01%) ↑
tensorflow  94.05% <100.00%> (+0.08%) ↑
onnx        93.04% <100.00%> (-0.01%) ↓
openvino    25.65% <100.00%> (-68.42%) ↓
ptq         67.18% <ø> (-20.48%) ↓

codecov[bot] avatar Oct 20 '23 12:10 codecov[bot]

ONNX E2E build 489 shows an accuracy improvement for densenet12 (+0.77% acc.) and resnet50-v2-7 (+0.1% acc.), and no difference for the other models.

TF E2E build 490 shows no significant changes compared to the regular nightly run.

PTQ build 240 shows an accuracy degradation of 0.1% on timm/dpn68 and of 0.2% on timm/visformer_small. Both regressions are for the Torch backend only, and in both cases the INT8 metric for the model is still within 99% of FP32.
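
For clarity, "within 99% of FP32" is read here as the INT8 metric retaining at least 99% of the FP32 metric; a one-line sketch of that check (the 0.99 ratio is inferred from the wording, and the helper name is made up):

```python
def within_fp32_budget(int8_metric: float, fp32_metric: float, ratio: float = 0.99) -> bool:
    """True when the quantized metric keeps at least `ratio` of the FP32 metric."""
    return int8_metric >= ratio * fp32_metric


# A 0.2% absolute drop on a ~80% top-1 model still passes the 99% criterion:
assert within_fp32_budget(79.8, 80.0)
```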

vshampor avatar Nov 08 '23 09:11 vshampor

@KodiaqQ

> General questions:
>
>   1. How does scale unification work in this PR? Based on the code, I assume that all possible layers are unified now. Is that correct?
>   2. Why were most TensorFlow models updated in tests? What about other backends?
>   3. Were any conformance tests run on the final version? Are there any degradations?

  1. See the tests. The activation scales are unified based on the marker in the HW config, without taking metatypes into account (see the sketch after this list).
  2. I think your original PR, which changed the same code, also led to TF model updates, so there should be no surprise here.
  3. See the comment above ("PTQ build 240").
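
To make the first answer concrete, here is a hypothetical sketch of the kind of per-operation marker in the HW config and the metatype-agnostic check it implies; the dict layout, the `attributes`/`scales` key names, and the `q8_a` identifier are assumptions for illustration, not the exact contents of NNCF's HW-config files:

```python
# Hypothetical HW-config operation entry carrying a "unified scales" marker
# (key names and the "q8_a" identifier are illustrative assumptions):
concat_entry = {
    "type": "Concat",
    "attributes": {"scales": "unified"},
    "quantization": {"activations": "q8_a"},
}


def requires_unified_scales(op_entry: dict) -> bool:
    """The decision is driven purely by the HW-config marker,
    not by the operation's metatype."""
    return op_entry.get("attributes", {}).get("scales") == "unified"


assert requires_unified_scales(concat_entry)
```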

vshampor avatar Jan 29 '24 10:01 vshampor

>   1. See the tests. The activation scales are unified based on the marker in the HW config, without taking metatypes into account.
>   2. I think your original PR, which changed the same code, also led to TF model updates, so there should be no surprise here.
>   3. See the comment above ("PTQ build 240").

Thanks! Let's run the validation again (with the latest changes in NNCF & OV) and verify the results. Then merge.

KodiaqQ avatar Jan 29 '24 14:01 KodiaqQ

@KodiaqQ PTQ build 285 shows a single regression vs. build 286, on timm/visformer_small. The overall metric is still within 99% of FP32.

vshampor avatar Feb 15 '24 19:02 vshampor

> @KodiaqQ PTQ build 285 shows a single regression vs. build 286, on timm/visformer_small. The overall metric is still within 99% of FP32.

If the PTQ build is red, then we need to update the reference for the timm/visformer_small model as well. We can do it in this PR or in a follow-up, it doesn't matter, but we should do it to keep the builds green. Also, is there any insight into why this model shows lower accuracy than the OV and ONNX versions?

KodiaqQ avatar Feb 16 '24 08:02 KodiaqQ