What does this PR do?

This is a preparation PR for the introduction of OVMultiQuantizationConfig, whose purpose is to hold a separate quantization config for each model in a pipeline.

Changes:

- Introduce a CalibrationDataset class representing Dict[str, nncf.Dataset]. The reasoning is that most recent model pipelines contain many ov.Model components rather than a single model.model, as was the case before. To quantize such pipelines in a data-aware fashion, we need a dataset per ov submodel. This is what CalibrationDataset represents.
- Move the calibration dataset collection logic into a newly added OVCalibrationDatasetBuilder class. It is not exposed to users and is instantiated only inside OVQuantizer. The class supports the following methods:
  - load_dataset(...) -- does essentially the same as the current OVQuantizer.get_calibration_dataset() method.
  - build_from_dataset(dataset, ...) -> CalibrationDataset -- builds an nncf calibration dataset from an instance of datasets.Dataset.
  - build_from_dataset_name(dataset_name, ...) -> CalibrationDataset -- builds a calibration dataset by first calling the load_dataset() method and then the build_from_dataset() method.
  - build_from_quantization_config(q_config) -> CalibrationDataset -- builds a calibration dataset from just a quantization config, using the q_config.dataset field and others. Relies on build_from_dataset_name().
- Create a separate package at optimum.intel.openvino.quantization containing all quantization-related logic, split into multiple files. The single large quantization.py file was becoming harder to maintain.
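
To make the relationship between the pieces above concrete, here is a minimal, library-free sketch. It is not the code from this PR: CalibrationDatasetSketch, BuilderSketch, QConfigSketch (including its num_samples field) and the fake in-memory loader are hypothetical stand-ins, and plain lists replace nncf.Dataset and datasets.Dataset so that it runs without NNCF installed. It only illustrates the per-submodel mapping and the delegation chain build_from_quantization_config() -> build_from_dataset_name() -> load_dataset() + build_from_dataset().

```python
from dataclasses import dataclass
from typing import Any, Dict, List


class CalibrationDatasetSketch:
    """Hypothetical stand-in for CalibrationDataset: submodel name -> samples.

    In the PR the values are nncf.Dataset objects; plain lists are used here
    so the sketch has no third-party dependencies.
    """

    def __init__(self, datasets: Dict[str, List[Any]]):
        self._datasets = dict(datasets)

    def __getitem__(self, submodel_name: str) -> List[Any]:
        return self._datasets[submodel_name]

    def keys(self):
        return self._datasets.keys()


@dataclass
class QConfigSketch:
    """Hypothetical config holding only the fields this sketch needs."""

    dataset: str
    num_samples: int = 2


class BuilderSketch:
    """Hypothetical stand-in for OVCalibrationDatasetBuilder."""

    def __init__(self, submodel_names: List[str]):
        self.submodel_names = submodel_names

    def load_dataset(self, dataset_name: str, num_samples: int) -> List[str]:
        # Assumption: a fake in-memory loader instead of a real dataset download.
        return [f"{dataset_name}-sample-{i}" for i in range(num_samples)]

    def build_from_dataset(self, dataset: List[str]) -> CalibrationDatasetSketch:
        # Placeholder logic: every submodel simply gets tagged copies of the samples.
        return CalibrationDatasetSketch(
            {name: [(name, sample) for sample in dataset] for name in self.submodel_names}
        )

    def build_from_dataset_name(self, dataset_name: str, num_samples: int) -> CalibrationDatasetSketch:
        # Load first, then build, as described in the list above.
        return self.build_from_dataset(self.load_dataset(dataset_name, num_samples))

    def build_from_quantization_config(self, q_config: QConfigSketch) -> CalibrationDatasetSketch:
        # Everything needed is taken from the config and passed down the chain.
        return self.build_from_dataset_name(q_config.dataset, q_config.num_samples)


builder = BuilderSketch(["encoder", "decoder"])
calibration = builder.build_from_quantization_config(QConfigSketch(dataset="demo"))
print(sorted(calibration.keys()))  # ['decoder', 'encoder']
print(len(calibration["encoder"]))  # 2
```

The point of the mapping is that each submodel name resolves to its own calibration data, so a quantizer can look up the dataset for whichever submodel it is currently processing.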

Deprecations:

All changes are backward compatible. However, some deprecation warnings (until v1.25) are added to discourage a few rare corner-case API usages which I believe should be avoided:

- The import path optimum.intel.openvino.configuration is deprecated and should no longer be used. I believe everything users need is mainly imported from optimum.intel.openvino, so this is not critical. optimum.intel.openvino.quantization should be used instead. For now the import still works, but a deprecation warning is printed.
- Providing calibration_dataset to OVQuantizer.quantize() as a list is deprecated. Currently this is only supported for a list of prompts for hybrid quantization of diffusion models. It is better to provide the list via the quantization_config.dataset field, as is done for causal model quantization.
- The remove_unused_columns argument of OVQuantizer.quantize() is deprecated because it relies on model.forward() being used for inference, which we can't always guarantee. For example, there is also the model.generate() method. In general, I believe users should remove unnecessary columns themselves if they provide a datasets.Dataset instance, which is already a fairly advanced usage scenario.
- Currently, all columns except the caption column are removed from a datasets.Dataset instance in the case of hybrid quantization. I haven't found any usages of this and believe it is not currently used, but I added a deprecation warning anyway.
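
As an illustration of the deprecation behavior described above, the following is a minimal sketch and not the PR's implementation. quantize_sketch is a hypothetical function, and the warning category, the None sentinel default, and the message texts are assumptions. Only the two deprecated usages (a list passed as calibration_dataset, and the remove_unused_columns argument) come from the description.

```python
import warnings
from typing import Any, List, Optional


def quantize_sketch(
    calibration_dataset: Optional[Any] = None,
    remove_unused_columns: Optional[bool] = None,
) -> List[str]:
    """Hypothetical stand-in for OVQuantizer.quantize(); returns the notices it emitted."""
    notices = []
    if isinstance(calibration_dataset, list):
        notices.append(
            "Passing calibration_dataset as a list is deprecated; "
            "use quantization_config.dataset instead."
        )
    if remove_unused_columns is not None:
        notices.append(
            "remove_unused_columns is deprecated; "
            "remove unneeded columns from the dataset before passing it."
        )
    for message in notices:
        # stacklevel=2 attributes the warning to the caller's line.
        # FutureWarning is an assumed category, chosen because it is shown by default.
        warnings.warn(message, FutureWarning, stacklevel=2)
    return notices


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    quantize_sketch(calibration_dataset=["a prompt"], remove_unused_columns=True)

print(len(caught))  # 2
```

Calls that use neither deprecated form stay silent, which is what keeps the change backward compatible for everyone else.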

Future plans:

In the next PR OVMultiQuantizationConfig will be introduced. The changes implemented here will be helpful when implementing support for this type of config.

Further in the future (past v1.25), I would consider changing the OVQuantizer.get_calibration_dataset signature. I believe it would be better for it to return a CalibrationDataset instance built from either a datasets.Dataset instance or a dataset name. This way OVQuantizer.quantize() will accept a fully prepared CalibrationDataset instance, and its batch_size and data_collator arguments can be removed. For example, in the future it could look like this:

calibration_dataset: CalibrationDataset = ov_quantizer.get_calibration_dataset(
  ov_config,
  dataset_name,
  ...,
  batch_size,
  data_collator,
)
ov_quantizer.quantize(calibration_dataset, ov_config)

Before submitting

[ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
[ ] Did you make sure to update the documentation with your changes?
[ ] Did you write any new necessary tests?

Mar 05 '25 15:03 nikita-savelyevv

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Mar 05 '25 15:03 HuggingFaceDocBuilderDev

@l-bat Could you please help with review of this PR?

Mar 07 '25 12:03 nikita-savelyevv

OVMultiQuantizationConfig sounds too generic. As you mentioned that this is for pipeline quantization, I think it makes sense to call it OVPipelineQuantizationConfig in the future.

Mar 10 '25 06:03 AlexKoff88

I am fine with this reshuffle, but a bit concerned about the changes in the import system and BC. I noticed you had to change imports in the tests anyway. Can this cause issues in users' code or in integrations we did in the past?

Mar 10 '25 06:03 AlexKoff88

> I am fine with this reshuffle, but a bit concerned about the changes in the import system and BC.

Most public quantization-related entities such as

OVQuantizer
OVConfig
OVDynamicQuantizationConfig
OVMixedQuantizationConfig
OVQuantizationConfig
OVWeightQuantizationConfig

should be imported from optimum.intel directly, e.g. from optimum.intel import OVQuantizer or from optimum.intel.openvino import OVQuantizer. So if users have imported these entities this way, their code will automatically be compatible with these changes.

There are some entities like InferRequestWrapper which are not publicly exposed. I've tried to keep import for those as is for now. For example optimum.intel.openvino.configuration.py is kept for now only for import compatibility purposes.

The imports in tests and some other files were changed because some internal objects, like _DEFAULT_4BIT_CONFIGS, are imported there. I've updated these imports to match the file reshuffle.

Mar 19 '25 10:03 nikita-savelyevv

> There are some entities like InferRequestWrapper which are not publicly exposed. I've tried to keep import for those as is for now.

We use it in some notebooks and imports like from optimum.intel.openvino.quantization import InferRequestWrapper e.g. https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/distil-whisper-asr/distil-whisper-asr.ipynb

Should we adjust something in notebooks?

Mar 20 '25 04:03 eaidova

Introduce a CalibrationDataset class representing Dict[str, nncf.Dataset].

I think it makes sense to register a dummy object for it in https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/utils/dummy_openvino_and_nncf_objects.py and add it to the import structure here: https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/__init__.py#L76-L98

It is needed for imports from the optimum.intel namespace.

Mar 20 '25 04:03 eaidova

> There are some entities like InferRequestWrapper which are not publicly exposed. I've tried to keep import for those as is for now.
>
> We use it in some notebooks and imports like from optimum.intel.openvino.quantization import InferRequestWrapper e.g. https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/distil-whisper-asr/distil-whisper-asr.ipynb
>
> Should we adjust something in notebooks?

Yes. My plan was to update import paths in openvino_notebooks once this PR is merged.

Mar 20 '25 07:03 nikita-savelyevv

> Introduce a CalibrationDataset class representing Dict[str, nncf.Dataset].
>
> I think it makes sense to register dummy object for it in https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/utils/dummy_openvino_and_nncf_objects.py and import structure here https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/__init__.py#L76-L98
>
> it is needed for import from optimum.intel namespace

Thanks. Added.

Mar 20 '25 07:03 nikita-savelyevv

@nikita-savelyevv please resolve conflicts and make sure tests are passing.

Mar 21 '25 12:03 IlyasMoutawwakil

> @nikita-savelyevv please resolve conflicts and make sure tests are passing.

Done. All failed tests are due to connection errors.

Mar 21 '25 17:03 nikita-savelyevv

@IlyasMoutawwakil could you please review?

Mar 26 '25 10:03 nikita-savelyevv

@IlyasMoutawwakil This is a gentle reminder. Thanks!

Apr 01 '25 08:04 nikita-savelyevv

Decided to split this into two PRs for an easier review process.

Apr 10 '25 10:04 nikita-savelyevv

Introduce OVCalibrationDatasetBuilder
