[Good First Issue][NNCF]: check number of u8, u4 constants in weight compression tests

Open ljaljushkin opened this issue 1 year ago • 4 comments

Context

This issue proposes adding an additional check to the post-training compression conformance suite, specifically for the weight compression part. Currently, the suite counts only the number of FakeQuantize ops (code). That check applies only to quantization algorithms; the weight compression algorithm creates u4 or u8 constants instead.

What needs to be done?

  1. Add a function for counting the number of u4 and u8 ops, similar to how it's done in the Optimum tests (see get_num_quantized_nodes).
  2. Integrate this function into the conformance test hierarchy. Refer to get_num_compressed for the quantization algorithm.
  3. Add a check that the actual number of u4 and u8 ops matches the reference values. Refer to the accuracy check: https://github.com/openvinotoolkit/nncf/blob/develop/tests/post_training/pipelines/base.py#L285
  4. Add the number of u4 and u8 ops to results.csv: https://github.com/openvinotoolkit/nncf/blob/develop/tests/post_training/test_quantize_conformance.py#L123

Example of the current results.csv:
Model,Backend,Metric name,Metric value,Metric diff,RAM MiB,Compr. time,Stat. collection time,Mixed-Precision search time,AWQ time,Apply Compression time,Total time,Status
tinyllama_data_aware_awq_stateful,OV,Similarity,0.81229,-0.18771,31028,0:02:32,0:00:55,0:00:19,0:00:48,0:00:23,0:04:48,

More human-readable visualization: image

Please refer to the README for instructions on how to run the tests: https://github.com/openvinotoolkit/nncf/blob/develop/tests/post_training/README.md#usage
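For illustration, item 1 could be sketched roughly as follows. This is only a sketch, not the actual NNCF or Optimum implementation: the helper names `iter_constant_dtypes` and `count_u4_u8` are hypothetical, and it assumes that the OpenVINO model exposes `get_ops()`, that each op exposes `get_type_name()` and `get_output_element_type(i)`, and that the element type name is reported as the string "u4" or "u8".

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def iter_constant_dtypes(ov_model) -> Iterator[str]:
    """Yield the element type name of every Constant op in the model.

    Assumption: ov_model behaves like an OpenVINO Model, i.e. it has
    get_ops(), and each op has get_type_name() and get_output_element_type(i).
    """
    for op in ov_model.get_ops():
        if op.get_type_name() == "Constant":
            yield op.get_output_element_type(0).get_type_name()


def count_u4_u8(dtype_names: Iterable[str]) -> Dict[str, int]:
    """Count how many of the given element type names are u4 and u8.

    Assumption: the types are reported as the exact strings "u4" and "u8".
    """
    counts = Counter(dtype_names)
    return {"u4": counts["u4"], "u8": counts["u8"]}
```

With a compressed model loaded, the call would look like `count_u4_u8(iter_constant_dtypes(model))`. Keeping the counting separate from the model traversal makes the counting logic testable without OpenVINO installed.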

Example Pull Requests

https://github.com/openvinotoolkit/nncf/pull/2463 https://github.com/openvinotoolkit/nncf/pull/2425

Resources

Contact points

@ljaljushkin

Ticket

132562

ljaljushkin avatar Mar 08 '24 13:03 ljaljushkin

.take

YutingGao7 avatar Mar 10 '24 19:03 YutingGao7

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

github-actions[bot] avatar Mar 10 '24 19:03 github-actions[bot]

Hi @ljaljushkin, I posted https://github.com/openvinotoolkit/nncf/pull/2571 for tracking and reporting u8 and u4.

For item 3, "Add a check that actual number of u4 and u8 ops matches the reference values": is the idea to add the check to prevent regressions (for example, the number of u4 and u8 ops should not increase over time)? I don't see reference values defined for now. Should I use the current counts as the reference values?

YutingGao7 avatar Mar 11 '24 18:03 YutingGao7

Greetings @YutingGao7! Thanks for the PR! Yes, you are correct. The number of u4 and u8 ops shouldn't change over time. Feel free to use the current counts as references. I will double-check them manually.

ljaljushkin avatar Mar 12 '24 10:03 ljaljushkin
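The regression check discussed above (item 3) might look something like the sketch below. It is an assumption-laden illustration, not the code from the PR: the function name `find_count_mismatches` and the dictionary layout are hypothetical, and the numbers in the usage note are made up.

```python
from typing import Dict, List


def find_count_mismatches(actual: Dict[str, int], reference: Dict[str, int]) -> List[str]:
    """Return one message per data type whose count differs from the reference.

    An empty list means the actual counts match the reference values exactly.
    A data type missing from `actual` is treated as a count of zero.
    """
    errors = []
    for dtype, expected in sorted(reference.items()):
        found = actual.get(dtype, 0)
        if found != expected:
            errors.append(f"{dtype}: expected {expected}, got {found}")
    return errors
```

For example, `find_count_mismatches({"u4": 10, "u8": 3}, {"u4": 10, "u8": 4})` reports a single mismatch for u8. A test could then fail whenever the returned list is non-empty, in the same way the accuracy check fails when the metric drifts from its reference.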

The assignee was unassigned due to the lack of activity.

ljaljushkin avatar Apr 03 '24 09:04 ljaljushkin

.take

DaniAffCH avatar Apr 03 '24 10:04 DaniAffCH

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

github-actions[bot] avatar Apr 03 '24 10:04 github-actions[bot]

Is there any link from where I can download the ImageNet dataset already formatted for testing?

DaniAffCH avatar Apr 05 '24 11:04 DaniAffCH

Hello, @DaniAffCH! I believe you don't need such a big dataset in this context. The task is mostly about the weight compression (weight-only compression) tests, which validate the TinyLlama model on only ~20 prompts. Potentially, your changes can affect the post-training quantization (PTQ: weight + activation quantization) tests. In that case, you can check them without ImageNet. For example, run tests that download the fairly small sst2 dataset: test_ptq_quantization[hf/hf-internal-testing/tiny-random-GPTNeoXForCausalLM_backend_OPTIMUM] or test_quantize_conformance.py::test_ptq_quantization[hf/bert-base-uncased_backend_OV]

ljaljushkin avatar Apr 05 '24 12:04 ljaljushkin

Thank you for the clarification, I opened a PR for this!

DaniAffCH avatar Apr 05 '24 15:04 DaniAffCH