[NNCF]: Optimized memory footprint by removing redundant collected statistics

Open AdiKsOnDev opened this issue 3 months ago • 29 comments

Changes

Made changes in:

nncf/common/tensor_statistics/statistic_point.py
nncf/quantization/algorithms/pipeline.py

Focus on the way I used newly created remove_statistic_point() inside pipeline.py, to see if it's up to the expectations

Reason for changes

Save space by removing "unused" statistic points associated with an algorithm

Related tickets

120377

Tests

tests/common/test_statistic_points.py was added:

[x] Test removing associated statistical points
[x] Test removing from an empty container

Closes #2557

Mar 08 '24 12:03 AdiKsOnDev

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 52.32%. Comparing base (8ad77dc) to head (aefd170). Report is 3 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff              @@
##           develop    #2563       +/-   ##
============================================
- Coverage    77.91%   52.32%   -25.59%     
============================================
  Files          494      494               
  Lines        45387    45387               
============================================
- Hits         35363    23749    -11614     
- Misses       10024    21638    +11614

Files	Coverage Δ
nncf/common/tensor_statistics/statistic_point.py	`93.10% <100.00%> (+1.61%)`	:arrow_up:
nncf/quantization/algorithms/algorithm.py	`100.00% <100.00%> (ø)`
nncf/quantization/algorithms/pipeline.py	`93.50% <100.00%> (-1.17%)`	:arrow_down:

... and 276 files with indirect coverage changes

Flag	Coverage Δ
COMMON	`44.22% <77.77%> (?)`
ONNX	`34.65% <100.00%> (?)`
TENSORFLOW	`30.11% <22.22%> (+<0.01%)`	:arrow_up:
TORCH	`?`

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
common	`87.60% <100.00%> (-0.72%)`	:arrow_down:
torch	`32.25% <ø> (-61.24%)`	:arrow_down:
tensorflow	`93.74% <ø> (ø)`
onnx	`93.07% <ø> (+93.07%)`	:arrow_up:
openvino	`0.00% <ø> (-25.77%)`	:arrow_down:
ptq	`48.47% <100.00%> (-4.58%)`	:arrow_down:

Mar 08 '24 13:03 codecov[bot]

@kshpv Are any actions expected from me any further?

Mar 08 '24 18:03 AdiKsOnDev

@kshpv Is any action expected from me any further?

Could you add a test on this functionality? You can create a file test_statistic_points.py with a test and put it in tests/common/

Also, it would be beneficial if you provide the memory usage before and after your changes for some of NNCF examples

Thank you!

Mar 08 '24 18:03 kshpv

Yes, absolutely! I shall request a re-review for another PR tomorrow (hopefully) with a test for this and I'll give the results of a benchmark before/after the change

Thank you for responding so quickly!

Mar 08 '24 18:03 AdiKsOnDev

Benchmark Examples

@kshpv

Before

before

After

Screenshot_20240309_161303

Mar 09 '24 12:03 AdiKsOnDev

@KodiaqQ Are any actions expected from me any further?

Mar 11 '24 13:03 AdiKsOnDev

Benchmark Examples

@kshpv

Before

...

After

...

Hi, @AdiKsOnDev. Thank you for your contribution. I wonder to understand, why the performance of the model was decreased after the changes?

Mar 11 '24 13:03 KodiaqQ

Benchmark Examples

@kshpv

Before

...

After

...

Hi, @AdiKsOnDev. Thank you for your contribution. I wonder to understand, why the performance of the model was decreased after the changes?

Hi, can it be that in the remove_statistical_points() the loop that's doing all the work is taking too long?

Mar 11 '24 13:03 AdiKsOnDev

@KodiaqQ I returned the deleted line. Will update you with the benchmark in a bit.

Mar 11 '24 16:03 AdiKsOnDev

Benchmarks

@KodiaqQ EDIT: I ran the benchmark again without the remove_statistical_points() and got the performance speed of 1.952 Everything else (Besides the latency and throughput) is the same as in screenshots below

Before (Taken from previous comment)

before

After

Mar 11 '24 16:03 AdiKsOnDev

@AdiKsOnDev, could you please provide memory usage measurements before & after changes?

Mar 12 '24 07:03 KodiaqQ

@AdiKsOnDev, could you please provide memory usage measurements before & after changes?

Sorry if it's a stupid question, but where can I check that?

Mar 12 '24 07:03 AdiKsOnDev

@AdiKsOnDev, could you please provide memory usage measurements before & after changes?

Sorry if it's a stupid question, but where can I check that?

You can use a memory-profiler package, for example.

Mar 12 '24 07:03 KodiaqQ

@AdiKsOnDev, could you please provide memory usage measurements before & after changes?

Sorry if it's a stupid question, but where can I check that?

You can use a memory-profiler package, for example.

Okay, thank you

Mar 12 '24 07:03 AdiKsOnDev

@KodiaqQ

Before

After

Mar 12 '24 07:03 AdiKsOnDev

Pardon, messed up the screenshots, will update in a bit

Mar 12 '24 07:03 AdiKsOnDev

@KodiaqQ

Before

After

Mar 12 '24 08:03 AdiKsOnDev

@KodiaqQ

Before

..

After

..

I think that change from ~765Mb -> ~750Mb while quantize is what we want.

Mar 12 '24 08:03 KodiaqQ

@KodiaqQ

Before

..

After

..

I think that change from ~765Mb -> ~750Mb while quantize is what we want.

Perfect! I hope we're good to go :)

Mar 12 '24 08:03 AdiKsOnDev

Internal run test_examples/295/ passed.

Mar 12 '24 11:03 KodiaqQ

@AdiKsOnDev, could you run another experiment, please? For the memory measurements in different cases. It may be done with the mobilenetv2 example, that you've used before. Here are the changes that should be enough to activate the extended scope of the algorithms:

changes.patch

With these changes, it needs to measure only memory consumption (may be done via memory-profiler as well). It would be great if you could provide only plots before and after. An example of the plot generation with memory-profiler can be found in the documentation (mprof plot command, as I can see).

Mar 12 '24 15:03 KodiaqQ

Sure, I'll try once I'm home Will update u tomorrow

Mar 12 '24 16:03 AdiKsOnDev

@KodiaqQ Pardon me for not showing the plots for the previous test, I did see them in the PyPi documentation but couldn't use them bcs of PyQt5... Anyways, here you go:

Before

After

Mar 12 '24 18:03 AdiKsOnDev

@KodiaqQ following up on the above

Mar 13 '24 09:03 AdiKsOnDev

@AdiKsOnDev, currently the latest changes break the tests. Could you fix the issues, please?

Mar 13 '24 09:03 KodiaqQ

@AdiKsOnDev, currently the latest changes break the tests. Could you fix the issues, please?

Yeah, I am already doing that

Mar 13 '24 10:03 AdiKsOnDev

@KodiaqQ Plots after the (possible) fix:

Before `remove_statistical_points()`

After `remove_statistical_points()`

Mar 13 '24 10:03 AdiKsOnDev

@KodiaqQ I read the review, will implement the changes as soon as I get the chance Thank you, have a good evening!

Mar 13 '24 14:03 AdiKsOnDev

@KodiaqQ could you check _algorithm_key property I added in the recent commit, please? I am not 100% sure if this is what you needed

Mar 15 '24 07:03 AdiKsOnDev

@KodiaqQ following up

Mar 19 '24 03:03 AdiKsOnDev

nncf nncf copied to clipboard

[NNCF]: Optimized memory footprint by removing redundant collected statistics

Changes

Reason for changes

Related tickets

Tests

Codecov Report

Benchmark Examples

Before

After

Benchmark Examples

Before

After

Benchmark Examples

Before

After

Benchmarks

Before (Taken from previous comment)

After

Before

After

Before

After

Before

After

Before

After

Before

After

Before remove_statistical_points()

After remove_statistical_points()

nncf
nncf copied to clipboard

Before `remove_statistical_points()`

After `remove_statistical_points()`