Align NPU to CPU
Changes
- Aligned the NPU configuration with the CPU configuration for 8-bit quantization (see the illustrative sketch below);
Reason for changes
- NPU supports the CPU configuration;
- The CPU configuration is used as the base;
- Removed unused configurations (ConvolutionBackpropData, GroupConvolutionBackpropData);
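For illustration, here is a rough Python rendering of the kind of per-operation entry such a hardware config holds; the key names and the "q8" preset below are assumptions made for the sketch, not the actual NNCF schema.

```python
# Illustrative only: every key and the "q8" preset name below are assumptions
# about the hardware-config schema, shown to clarify what "aligning the NPU
# configuration to CPU for 8-bit quantization" means at the entry level.
aligned_int8_entry = {
    "type": "Convolution",
    "quantization": {
        "activations": "q8",  # 8-bit activation preset, same as in the CPU config
        "weights": "q8",      # 8-bit weight preset, same as in the CPU config
    },
}
```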
Related tickets
- 132512
Tests
- TBD
Codecov Report
Attention: Patch coverage is 8.33333%, with 11 lines in your changes missing coverage. Please review.
Project coverage is 29.94%. Comparing base (ac15a65) to head (e5cfe06). Report is 1 commit behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #2560 +/- ##
============================================
- Coverage 62.07% 29.94% -32.13%
============================================
Files 494 494
Lines 45775 45791 +16
============================================
- Hits 28413 13714 -14699
- Misses 17362 32077 +14715
Files | Coverage Δ |
---|---|
nncf/common/utils/dot_file_rw.py | 58.33% <0.00%> (-31.95%) :arrow_down: |
nncf/quantization/algorithms/min_max/algorithm.py | 20.17% <9.09%> (-76.10%) :arrow_down: |

... and 262 files with indirect coverage changes
Flag | Coverage Δ |
---|---|
COMMON | ? |
ONNX | ? |
OPENVINO | ? |
TENSORFLOW | 29.94% <8.33%> (-0.01%) :arrow_down: |
Flags with carried forward coverage won't be shown.
Components | Coverage Δ |
---|---|
common | 76.35% <0.00%> (-12.08%) :arrow_down: |
torch | 0.01% <ø> (-32.85%) :arrow_down: |
tensorflow | 93.74% <ø> (ø) |
onnx | 0.00% <ø> (-93.07%) :arrow_down: |
openvino | 0.00% <ø> (-94.23%) :arrow_down: |
ptq | 15.24% <9.09%> (-65.68%) :arrow_down: |
Linters fail due to https://github.com/openvinotoolkit/nncf/pull/2561
Should NPU config also contain this?
"attributes": {
    "scales": "unified"
},
Also noticed this difference. What should we do with this? Are these parameters used for NPU?
Maybe @alexsu52 and @AlexKoff88 can answer this question.
Good question. Scale unification should be applied only to the operations from the CPU hardware config, to align CPU and NPU behavior in INT8 quantization. As for other precisions, we should not change their behavior.
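Not NNCF code, just a minimal numeric sketch of what unifying scales means for an op like Concat: both inputs share one quantization grid instead of per-input grids, at the cost of some precision on the narrower input.

```python
import numpy as np

def fake_quantize_int8(x: np.ndarray, scale: float) -> np.ndarray:
    """Symmetric signed 8-bit fake-quantization with a per-tensor scale."""
    return np.clip(np.round(x / scale), -128, 127) * scale

# Two activation tensors feeding the same concat.
a = np.random.uniform(-1.0, 1.0, size=1000)
b = np.random.uniform(-6.0, 6.0, size=1000)

# Independent scales: each input gets its own quantization grid.
scale_a = np.abs(a).max() / 127
scale_b = np.abs(b).max() / 127

# Unified scale: one grid derived from the combined range, so both concat
# inputs (and therefore the concatenated result) share a single quantizer.
unified_scale = max(np.abs(a).max(), np.abs(b).max()) / 127

# The narrower input loses some precision under the unified grid.
err_own = np.abs(a - fake_quantize_int8(a, scale_a)).mean()
err_unified = np.abs(a - fake_quantize_int8(a, unified_scale)).mean()
print(f"mean error for 'a': own scale {err_own:.4f}, unified scale {err_unified:.4f}")
```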
Do you have results of the conformance tests for the NPU config?
Please think about what test needs to be added to check the equality of the CPU, GPU, and NPU configs for INT8.
We have no tests for NPU in the conformance suite either. I haven't run these tests.
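A rough sketch of the config-equality test suggested above. The config directory path and the JSON layout (an "operations" list whose entries carry a "quantization" section) are assumptions and would need to be adjusted to the real schema.

```python
import json
from pathlib import Path

import pytest

CONFIG_DIR = Path("nncf/common/hardware/configs")  # assumed location of the hardware configs


def quantized_operations(config_path: Path) -> dict:
    """Map operation type -> its quantization entry, ignoring non-quantization fields."""
    data = json.loads(config_path.read_text())
    return {
        op["type"]: op["quantization"]
        for op in data.get("operations", [])
        if "quantization" in op
    }


@pytest.mark.parametrize("device", ["gpu", "npu"])
def test_int8_config_matches_cpu(device: str):
    cpu_ops = quantized_operations(CONFIG_DIR / "cpu.json")
    dev_ops = quantized_operations(CONFIG_DIR / f"{device}.json")
    # Every operation the device config quantizes should be configured exactly
    # as in the CPU config, since the CPU config is the base.
    for op_type, quantization in dev_ops.items():
        assert quantization == cpu_ops.get(op_type), f"{device}: {op_type} differs from CPU"
```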
Just for the record, the main motivation for keeping the config is QAT for NPU, which has some custom features such as W4A4 support.
@alexsu52, please review.