Enable AutoRound quantization.
Requires neural-compressor >= v2.6.0 to avoid an incompatibility with ITREX caused by a missing `n_samples` argument and a `train_bs` / `bs` keyword mismatch:
https://github.com/intel/intel-extension-for-transformers/blob/v1.4.2/intel_extension_for_transformers/transformers/llm/quantization/utils.py#L397
https://github.com/intel/neural-compressor/blob/v2.5.1/neural_compressor/adaptor/torch_utils/auto_round.py#L19
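Since the fix depends on the installed neural-compressor version, a guard like the following could enforce the requirement before enabling the AutoRound path. This is a minimal stdlib-only sketch; the helper names (`parse_version`, `neural_compressor_supported`) are illustrative, not part of optimum-intel's API.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum neural-compressor release that accepts the n_samples argument
# and uses a consistent batch-size keyword (see the linked code above).
MIN_NC_VERSION = (2, 6, 0)

def parse_version(v: str) -> tuple:
    # Parse "2.6.0" -> (2, 6, 0); non-numeric (pre-release) parts are dropped.
    return tuple(int(part) for part in v.split(".")[:3] if part.isdigit())

def neural_compressor_supported() -> bool:
    """Return True if the installed neural-compressor is recent enough."""
    try:
        installed = version("neural-compressor")
    except PackageNotFoundError:
        return False
    return parse_version(installed) >= MIN_NC_VERSION
```

With this in place, the AutoRound code path can raise a clear error for neural-compressor v2.5.x instead of failing later inside ITREX.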
Note: ITREX support was deprecated in https://github.com/huggingface/optimum-intel/pull/880