nncf icon indicating copy to clipboard operation
nncf copied to clipboard

Align NPU to CPU

Open KodiaqQ opened this issue 11 months ago • 8 comments

Changes

  • Aligned NPU configuration to CPU for 8bit quantization;

Reason for changes

  • NPU supports CPU configuration;
  • CPU configuration as the base;
  • Removed unused configurations (ConvolutionBackpropData, GroupConvolutionBackpropData);

Related tickets

  • 132512

Tests

  • TBD

KodiaqQ avatar Mar 08 '24 11:03 KodiaqQ

Codecov Report

Attention: Patch coverage is 8.33333% with 11 lines in your changes are missing coverage. Please review.

Project coverage is 29.94%. Comparing base (ac15a65) to head (e5cfe06). Report is 1 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##           develop    #2560       +/-   ##
============================================
- Coverage    62.07%   29.94%   -32.13%     
============================================
  Files          494      494               
  Lines        45775    45791       +16     
============================================
- Hits         28413    13714    -14699     
- Misses       17362    32077    +14715     
Files Coverage Δ
nncf/common/utils/dot_file_rw.py 58.33% <0.00%> (-31.95%) :arrow_down:
nncf/quantization/algorithms/min_max/algorithm.py 20.17% <9.09%> (-76.10%) :arrow_down:

... and 262 files with indirect coverage changes

Flag Coverage Δ
COMMON ?
ONNX ?
OPENVINO ?
TENSORFLOW 29.94% <8.33%> (-0.01%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
common 76.35% <0.00%> (-12.08%) :arrow_down:
torch 0.01% <ø> (-32.85%) :arrow_down:
tensorflow 93.74% <ø> (ø)
onnx 0.00% <ø> (-93.07%) :arrow_down:
openvino 0.00% <ø> (-94.23%) :arrow_down:
ptq 15.24% <9.09%> (-65.68%) :arrow_down:

codecov[bot] avatar Mar 08 '24 11:03 codecov[bot]

Linters fails due to https://github.com/openvinotoolkit/nncf/pull/2561

KodiaqQ avatar Mar 08 '24 11:03 KodiaqQ

Should NPU config also contain this?

            "attributes": {
                "scales": "unified"
            },

ljaljushkin avatar Mar 11 '24 12:03 ljaljushkin

Also noticed this difference image

ljaljushkin avatar Mar 11 '24 12:03 ljaljushkin

Should NPU config also contain this?

            "attributes": {
                "scales": "unified"
            },

Maybe @alexsu52 and @AlexKoff88 can answer this question.

KodiaqQ avatar Mar 11 '24 12:03 KodiaqQ

Also noticed this difference image

What should we do with this? Are these parameters used for NPU?

KodiaqQ avatar Mar 11 '24 12:03 KodiaqQ

Should NPU config also contain this?

            "attributes": {
                "scales": "unified"
            },

Maybe @alexsu52 and @AlexKoff88 can answer this question.

Good question. Scale unification should be applied only for operation from CPU hardware config to align behavior between CPU and NPU in INT8 quantization. What affects other precision, we should not change their behavior.

alexsu52 avatar Mar 12 '24 06:03 alexsu52

Do you have results of the conformance test for NPU config?

Please think about what test needs to be added to check the equality of the cpu, gpu and npu configs for int8.

We have no tests for NPU in the conformance as well. I haven't run these tests.

KodiaqQ avatar Mar 13 '24 07:03 KodiaqQ

Just for the record, the main motivation for keeping the config is QAT for NPU which has some custom features such as W4A4 support.

AlexKoff88 avatar Mar 18 '24 11:03 AlexKoff88

@alexsu52, please, review

KodiaqQ avatar May 06 '24 08:05 KodiaqQ