nncf icon indicating copy to clipboard operation
nncf copied to clipboard

[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model

Open l-bat opened this issue 11 months ago • 13 comments

Context

After applying quantization to the ov.Model in Neural Network Compression Framework (NNCF), the quantization parameters, including subset_size, are dumped to the meta section of the OpenVINO IR. subset_size represents the size of the dataset used for calibration. https://github.com/openvinotoolkit/nncf/blob/09960b9b71a58f3277d0964531f8ee08365d7c72/nncf/openvino/quantization/quantize_model.py#L102

However, inconsistencies arise when the dataset size is less than the provided or default 'subset_size'. To address this confusion, it is proposed to also dump the actual_subset_size, which denotes the number of data samples used to calculate activation statistics. This addition will improve clarity and accuracy in managing quantization parameters and assist in reproducing quantization results.

What needs to be done?

  • Dump actual_subset_size parameter to ov.Model meta section.
  • Add tests

Example Pull Requests

No response

Resources

Contact points

@l-bat

Ticket

No response

l-bat avatar Mar 08 '24 11:03 l-bat

.take

AiGaf1 avatar Mar 09 '24 14:03 AiGaf1

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

github-actions[bot] avatar Mar 09 '24 14:03 github-actions[bot]

@AiGaf1 Hi, are you still working on this task? Do you need any help? Please inform us if you do not plan to continue working on this task. Thanks!

andrey-churkin avatar Mar 21 '24 17:03 andrey-churkin

Hello! is there any update on this issue? If not i wish to work on this issue.

RitikaxShakya avatar Mar 29 '24 11:03 RitikaxShakya

@l-bat could you please reassign the issue to @RitikaxShakya? I lack the permissions for NNCF repository.

p-wysocki avatar Apr 03 '24 18:04 p-wysocki

.take

RitikaxShakya avatar Apr 04 '24 06:04 RitikaxShakya

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

github-actions[bot] avatar Apr 04 '24 06:04 github-actions[bot]

Hello @RitikaxShakya, are you still working on that issue? Do you need any help?

p-wysocki avatar May 06 '24 09:05 p-wysocki

.take

awayzjj avatar Jun 27 '24 23:06 awayzjj

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

github-actions[bot] avatar Jun 27 '24 23:06 github-actions[bot]

Hi @l-bat I created a PR after testing locally, and the XML output is as expected: 截屏2024-06-29 11 46 58

I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions

  1. which file I should edit to add my unit test.
  2. how to implement the unit test, should I check the output XML to verify whether the actual_subset_size property exists?

Thank you very much!

awayzjj avatar Jun 29 '24 04:06 awayzjj

Hi @awayzjj! Thanks for your contribution!

calibration_dataset.get_length() returns the size of the dataset that was provided to the nncf.quantize method, however actual_subset_size should show the number of data samples that were used to calculate the activation statistics. In the case of calibration_dataset.get_length() >= subset_size, actual_subset_size is equal to subset_size. Otherwise, actual_subset_size must be equal to calibration_dataset.get_length(). But it is not possible to use the get_length() method if __len__() is not implemented. Please take a look at https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/nncf/common/tensor_statistics/aggregator.py#L50-L53. You can implement the get_actual_subset_size() function.

l-bat avatar Jul 01 '24 12:07 l-bat

I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions

  1. which file I should edit to add my unit test.
  2. how to implement the unit test, should I check the output XML to verify whether the actual_subset_size property exists?
  1. You can add test to https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py
  2. You can use the test as an example https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py#L178-L199

l-bat avatar Jul 01 '24 12:07 l-bat

@l-bat Hi, I've been really busy lately, so I've decided to unassign myself for now. I apologize for any inconvenience this may cause.

awayzjj avatar Aug 26 '24 14:08 awayzjj