nncf
nncf copied to clipboard
[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model
Context
After applying quantization to the ov.Model in Neural Network Compression Framework (NNCF), the quantization parameters, including subset_size
, are dumped to the meta section of the OpenVINO IR. subset_size
represents the size of the dataset used for calibration.
https://github.com/openvinotoolkit/nncf/blob/09960b9b71a58f3277d0964531f8ee08365d7c72/nncf/openvino/quantization/quantize_model.py#L102
However, inconsistencies arise when the dataset size is less than the provided or default 'subset_size'. To address this confusion, it is proposed to also dump the actual_subset_size
, which denotes the number of data samples used to calculate activation statistics. This addition will improve clarity and accuracy in managing quantization parameters and assist in reproducing quantization results.
What needs to be done?
- Dump
actual_subset_size
parameter to ov.Model meta section. - Add tests
Example Pull Requests
No response
Resources
Contact points
@l-bat
Ticket
No response
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
@AiGaf1 Hi, are you still working on this task? Do you need any help? Please inform us if you do not plan to continue working on this task. Thanks!
Hello! is there any update on this issue? If not i wish to work on this issue.
@l-bat could you please reassign the issue to @RitikaxShakya? I lack the permissions for NNCF repository.
.take
Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.
Hello @RitikaxShakya, are you still working on that issue? Do you need any help?
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
Hi @l-bat I created a PR after testing locally, and the XML output is as expected:
I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions
- which file I should edit to add my unit test.
- how to implement the unit test, should I check the output XML to verify whether the
actual_subset_size
property exists?
Thank you very much!
Hi @awayzjj! Thanks for your contribution!
calibration_dataset.get_length()
returns the size of the dataset that was provided to the nncf.quantize
method, however actual_subset_size
should show the number of data samples that were used to calculate the activation statistics.
In the case of calibration_dataset.get_length() >= subset_size
, actual_subset_size
is equal to subset_size
. Otherwise, actual_subset_size
must be equal to calibration_dataset.get_length()
. But it is not possible to use the get_length()
method if __len__()
is not implemented. Please take a look at https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/nncf/common/tensor_statistics/aggregator.py#L50-L53. You can implement the get_actual_subset_size()
function.
I ran the pytest in tests/openvino, and the original tests did not break. But I have 2 questions
- which file I should edit to add my unit test.
- how to implement the unit test, should I check the output XML to verify whether the actual_subset_size property exists?
- You can add test to https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py
- You can use the test as an example https://github.com/openvinotoolkit/nncf/blob/e8ea2521663de807d654ae4f375d20c904755061/tests/openvino/native/quantization/test_quantization_pipeline.py#L178-L199
@l-bat Hi, I've been really busy lately, so I've decided to unassign myself for now. I apologize for any inconvenience this may cause.