nncf [Good First Issue][NNCF]: Dump actual_subset

Context

After applying quantization to the ov.Model in Neural Network Compression Framework (NNCF), the quantization parameters, including subset_size, are dumped to the meta section of the OpenVINO IR. subset_size represents the size of the dataset used for calibration. https://github.com/openvinotoolkit/nncf/blob/09960b9b71a58f3277d0964531f8ee08365d7c72/nncf/openvino/quantization/quantize_model.py#L102

However, inconsistencies arise when the dataset size is less than the provided or default 'subset_size'. To address this confusion, it is proposed to also dump the actual_subset_size, which denotes the number of data samples used to calculate activation statistics. This addition will improve clarity and accuracy in managing quantization parameters and assist in reproducing quantization results.

What needs to be done?

Dump actual_subset_size parameter to ov.Model meta section.
Add tests

Example Pull Requests

No response

Resources

Contribution guide - start here!

Contact points

@l-bat

Ticket

No response

Mar 08 '24 11:03 l-bat

.take

Mar 09 '24 14:03 AiGaf1

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

Mar 09 '24 14:03 github-actions[bot]

@AiGaf1 Hi, are you still working on this task? Do you need any help? Please inform us if you do not plan to continue working on this task. Thanks!

Mar 21 '24 17:03 andrey-churkin

Hello! is there any update on this issue? If not i wish to work on this issue.

Mar 29 '24 11:03 RitikaxShakya

@l-bat could you please reassign the issue to @RitikaxShakya? I lack the permissions for NNCF repository.

Apr 03 '24 18:04 p-wysocki

.take

Apr 04 '24 06:04 RitikaxShakya

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

Apr 04 '24 06:04 github-actions[bot]

Hello @RitikaxShakya, are you still working on that issue? Do you need any help?

May 06 '24 09:05 p-wysocki

.take

Jun 27 '24 23:06 awayzjj

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

Jun 27 '24 23:06 github-actions[bot]

Hi @l-bat I created a PR after testing locally, and the XML output is as expected: 截屏2024-06-29 11 46 58

I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions

which file I should edit to add my unit test.
how to implement the unit test, should I check the output XML to verify whether the actual_subset_size property exists?

Thank you very much!

Jun 29 '24 04:06 awayzjj

Hi @awayzjj! Thanks for your contribution!

calibration_dataset.get_length() returns the size of the dataset that was provided to the nncf.quantize method, however actual_subset_size should show the number of data samples that were used to calculate the activation statistics. In the case of calibration_dataset.get_length() >= subset_size, actual_subset_size is equal to subset_size. Otherwise, actual_subset_size must be equal to calibration_dataset.get_length(). But it is not possible to use the get_length() method if __len__() is not implemented. Please take a look at https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/nncf/common/tensor_statistics/aggregator.py#L50-L53. You can implement the get_actual_subset_size() function.

Jul 01 '24 12:07 l-bat

I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions

which file I should edit to add my unit test.

how to implement the unit test, should I check the output XML to verify whether the actual_subset_size property exists?

You can add test to https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py
You can use the test as an example https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py#L178-L199

Jul 01 '24 12:07 l-bat

@l-bat Hi, I've been really busy lately, so I've decided to unassign myself for now. I apologize for any inconvenience this may cause.

Aug 26 '24 14:08 awayzjj

nncf
nncf copied to clipboard

[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model

Context

What needs to be done?

Example Pull Requests

Resources

Contact points

Ticket

nncf nncf copied to clipboard

[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model

Context

What needs to be done?

Example Pull Requests

Resources

Contact points

Ticket

nncf
nncf copied to clipboard