Align NPU to CPU
Changes
- Aligned the NPU configuration with the CPU configuration for 8-bit quantization (see the illustrative sketch below);
Reason for changes
- NPU supports the CPU configuration;
- The CPU configuration is used as the base;
- Removed unused configurations (ConvolutionBackpropData, GroupConvolutionBackpropData);
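For illustration, here is a rough Python rendering of the kind of per-operation entry such a hardware config holds; the key names and the "q8" preset below are assumptions made for the sketch, not the actual NNCF schema.

```python
# Illustrative only: every key and the "q8" preset name below are assumptions
# about the hardware-config schema, shown to clarify what "aligning the NPU
# configuration to CPU for 8-bit quantization" means at the entry level.
aligned_int8_entry = {
    "type": "Convolution",
    "quantization": {
        "activations": "q8",  # 8-bit activation preset, same as in the CPU config
        "weights": "q8",      # 8-bit weight preset, same as in the CPU config
    },
}
```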
Related tickets
- 132512
Tests
- TBD
Codecov Report
Attention: Patch coverage is 8.33333%, with 11 lines in your changes missing coverage. Please review.
Project coverage is 29.94%. Comparing base (ac15a65) to head (e5cfe06). Report is 1 commit behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #2560 +/- ##
============================================
- Coverage 62.07% 29.94% -32.13%
============================================
Files 494 494
Lines 45775 45791 +16
============================================
- Hits 28413 13714 -14699
- Misses 17362 32077 +14715
Files | Coverage Δ |
---|---|
nncf/common/utils/dot_file_rw.py | 58.33% <0.00%> (-31.95%) :arrow_down: |
nncf/quantization/algorithms/min_max/algorithm.py | 20.17% <9.09%> (-76.10%) :arrow_down: |

... and 262 files with indirect coverage changes
Flag | Coverage Δ |
---|---|
COMMON | ? |
ONNX | ? |
OPENVINO | ? |
TENSORFLOW | 29.94% <8.33%> (-0.01%) :arrow_down: |
Flags with carried forward coverage won't be shown.
Components | Coverage Δ |
---|---|
common | 76.35% <0.00%> (-12.08%) :arrow_down: |
torch | 0.01% <ø> (-32.85%) :arrow_down: |
tensorflow | 93.74% <ø> (ø) |
onnx | 0.00% <ø> (-93.07%) :arrow_down: |
openvino | 0.00% <ø> (-94.23%) :arrow_down: |
ptq | 15.24% <9.09%> (-65.68%) :arrow_down: |
Linters fail due to https://github.com/openvinotoolkit/nncf/pull/2561
Should NPU config also contain this?
"attributes": {
    "scales": "unified"
},
Also noticed this difference. What should we do with this? Are these parameters used for NPU?
Maybe @alexsu52 and @AlexKoff88 can answer this question.
Good question. Scale unification should be applied only to the operations from the CPU hardware config, to align CPU and NPU behavior in INT8 quantization. As for other precisions, we should not change their behavior.
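Not NNCF code, just a minimal numeric sketch of what unifying scales means for an op like Concat: both inputs share one quantization grid instead of per-input grids, at the cost of some precision on the narrower input.

```python
import numpy as np

def fake_quantize_int8(x: np.ndarray, scale: float) -> np.ndarray:
    """Symmetric signed 8-bit fake-quantization with a per-tensor scale."""
    return np.clip(np.round(x / scale), -128, 127) * scale

# Two activation tensors feeding the same concat.
a = np.random.uniform(-1.0, 1.0, size=1000)
b = np.random.uniform(-6.0, 6.0, size=1000)

# Independent scales: each input gets its own quantization grid.
scale_a = np.abs(a).max() / 127
scale_b = np.abs(b).max() / 127

# Unified scale: one grid derived from the combined range, so both concat
# inputs (and therefore the concatenated result) share a single quantizer.
unified_scale = max(np.abs(a).max(), np.abs(b).max()) / 127

# The narrower input loses some precision under the unified grid.
err_own = np.abs(a - fake_quantize_int8(a, scale_a)).mean()
err_unified = np.abs(a - fake_quantize_int8(a, unified_scale)).mean()
print(f"mean error for 'a': own scale {err_own:.4f}, unified scale {err_unified:.4f}")
```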
Do you have results of the conformance tests for the NPU config?
Please think about what test needs to be added to check the equality of the CPU, GPU, and NPU configs for INT8.
We have no tests for NPU in the conformance suite either. I haven't run these tests.
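A rough sketch of the config-equality test suggested above. The config directory path and the JSON layout (an "operations" list whose entries carry a "quantization" section) are assumptions and would need to be adjusted to the real schema.

```python
import json
from pathlib import Path

import pytest

CONFIG_DIR = Path("nncf/common/hardware/configs")  # assumed location of the hardware configs


def quantized_operations(config_path: Path) -> dict:
    """Map operation type -> its quantization entry, ignoring non-quantization fields."""
    data = json.loads(config_path.read_text())
    return {
        op["type"]: op["quantization"]
        for op in data.get("operations", [])
        if "quantization" in op
    }


@pytest.mark.parametrize("device", ["gpu", "npu"])
def test_int8_config_matches_cpu(device: str):
    cpu_ops = quantized_operations(CONFIG_DIR / "cpu.json")
    dev_ops = quantized_operations(CONFIG_DIR / f"{device}.json")
    # Every operation the device config quantizes should be configured exactly
    # as in the CPU config, since the CPU config is the base.
    for op_type, quantization in dev_ops.items():
        assert quantization == cpu_ops.get(op_type), f"{device}: {op_type} differs from CPU"
```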
Just for the record, the main motivation for keeping the config is QAT for NPU, which has some custom features such as W4A4 support.
@alexsu52, please review.