[Good First Issue][NNCF]: check number of u8, u4 constants in weight compression tests
Context
The issue proposes adding an additional check to the post-training compression conformance suite, particularly for the weight compression part.
Currently, it counts only the number of FakeQuantize ops: code
But this is only applicable to quantization algorithms; the weight compression algorithm creates u4 or u8 constants instead.
What needs to be done?
- Add a function for counting the number of u4 and u8 ops, similar to how it's done in the Optimum tests: get_num_quantized_nodes
- Integrate this function into the conformance test hierarchy. Refer to get_num_compressed for the quantization algorithm.
- Add a check that the actual number of u4 and u8 ops matches the reference values. Refer to the check for accuracy: https://github.com/openvinotoolkit/nncf/blob/develop/tests/post_training/pipelines/base.py#L285
- Add the number of u4 and u8 ops to results.csv: https://github.com/openvinotoolkit/nncf/blob/develop/tests/post_training/test_quantize_conformance.py#L123

Example of the current results.csv:
Model,Backend,Metric name,Metric value,Metric diff,RAM MiB,Compr. time,Stat. collection time,Mixed-Precision search time,AWQ time,Apply Compression time,Total time,Status
tinyllama_data_aware_awq_stateful,OV,Similarity,0.81229,-0.18771,31028,0:02:32,0:00:55,0:00:19,0:00:48,0:00:23,0:04:48,
More human-readable visualization: (screenshot not included)
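A minimal sketch of such a counting function. In the real tests it would iterate the nodes of an OpenVINO model (e.g. `model.get_ops()`); the `_StubNode` class and the string-based element types here are illustrative assumptions only, used to keep the example self-contained.

```python
def get_num_compressed_weights(ops):
    """Count low-precision weight constants among graph nodes.

    `ops` is any iterable of node-like objects exposing get_type_name()
    and get_element_type(); with OpenVINO this would come from
    model.get_ops(). Returns (num_int8, num_int4).
    """
    num_int8 = 0
    num_int4 = 0
    for op in ops:
        if op.get_type_name() != "Constant":
            continue
        dtype = op.get_element_type()  # assumed here to be a string like "u8"
        if dtype in ("u8", "i8"):
            num_int8 += 1
        elif dtype in ("u4", "i4"):
            num_int4 += 1
    return num_int8, num_int4


class _StubNode:
    """Minimal stand-in for an OpenVINO node, for demonstration only."""

    def __init__(self, type_name, element_type):
        self._type_name = type_name
        self._element_type = element_type

    def get_type_name(self):
        return self._type_name

    def get_element_type(self):
        return self._element_type


ops = [
    _StubNode("Constant", "u8"),   # compressed weight
    _StubNode("Constant", "u4"),   # compressed weight
    _StubNode("Constant", "f16"),  # uncompressed constant, not counted
    _StubNode("MatMul", "f32"),    # not a constant, not counted
]
print(get_num_compressed_weights(ops))  # (1, 1)
```

The counts returned this way can then feed both the results.csv columns and the reference-value check described above.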
Please refer to the README for how to run the tests: https://github.com/openvinotoolkit/nncf/blob/develop/tests/post_training/README.md#usage
Example Pull Requests
https://github.com/openvinotoolkit/nncf/pull/2463 https://github.com/openvinotoolkit/nncf/pull/2425
Resources
Contact points
@ljaljushkin
Ticket
132562
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
Hi @ljaljushkin, I posted https://github.com/openvinotoolkit/nncf/pull/2571 for tracking and reporting u8 and u4.
For item 3 "Add a check that actual number of u4 and u8 ops matches the reference values": Is the idea to add the check to prevent regressions (for example, the number of u4 and u8 ops should not increase over time)? I don't see reference values defined for now. Should I use the current counts as the reference values?
Greetings @YutingGao7! Thanks for the PR! Yes, you are correct. The number of u4 and u8 ops shouldn't change over time. Feel free to use the current counts as references. I will double-check them manually.
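The regression check discussed here could look roughly like the sketch below. This is an assumption about shape, not the actual NNCF test code: the function name and the way reference values are passed in are hypothetical, and in the real suite the references would live alongside the other per-model reference data.

```python
def check_num_compressed_nodes(actual_int4, actual_int8, ref_int4, ref_int8):
    """Fail if the number of compressed weight constants drifts from the
    recorded reference values (hypothetical helper, not NNCF's API)."""
    errors = []
    if actual_int4 != ref_int4:
        errors.append(f"Expected {ref_int4} u4/i4 constants, got {actual_int4}")
    if actual_int8 != ref_int8:
        errors.append(f"Expected {ref_int8} u8/i8 constants, got {actual_int8}")
    if errors:
        raise AssertionError("; ".join(errors))


# Matching counts pass silently; any drift raises with a readable message.
check_num_compressed_nodes(10, 2, 10, 2)
```

Raising a single AssertionError with all mismatches joined keeps the conformance report readable when both counts drift at once.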
The assignee was unassigned due to the lack of activity.
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
Is there any link from where I can download the ImageNet dataset already formatted for testing?
Hello, @DaniAffCH! I believe you don't need such a big dataset in this context.
The task is mostly about weight compression (weight-only compression) tests that validate the TinyLLama model on ~20 prompts only.
Potentially, you can affect post-training quantization (or PTQ: weight + activation quantization) tests.
In that case, you can check them without ImageNet. For example, run tests that download the fairly small sst2 dataset:
test_ptq_quantization[hf/hf-internal-testing/tiny-random-GPTNeoXForCausalLM_backend_OPTIMUM] which uses sst2
or
test_quantize_conformance.py::test_ptq_quantization[hf/bert-base-uncased_backend_OV]
Thank you for the clarification, I opened a PR for this!