onnxruntime [python] Include 'per_channel' attribute when calibrating

Describe the issue

Currently, the calibration algorithms compute scalar ranges, i.e. a single range for each calibrated tensor. There is no option to set per_channel=True when calibrating.

It would be useful to have a 'per_channel' option in the initialization parameters of every 'Calibrater' class, so that calibration produces a vector of ranges (one per channel) instead of a single scalar range.
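
To illustrate the difference, here is a minimal sketch in plain Python (no onnxruntime calls). The function name `minmax_range`, the nested-list layout (sample, channel, element) and the sample values are assumptions made only for this example, not part of the existing calibrater API.

```python
from typing import List, Tuple

# One sample is indexed as sample[channel][element] (hypothetical layout).
Sample = List[List[float]]


def minmax_range(samples: List[Sample], per_channel: bool = False) -> List[Tuple[float, float]]:
    """Return (min, max) ranges observed over all calibration samples.

    per_channel=False: one range for the whole tensor (current behaviour).
    per_channel=True: one range per channel (the requested behaviour).
    """
    num_channels = len(samples[0])
    # Gather every value seen for each channel across all samples.
    values_per_channel = [
        [v for sample in samples for v in sample[c]]
        for c in range(num_channels)
    ]
    if per_channel:
        return [(min(vals), max(vals)) for vals in values_per_channel]
    flat = [v for vals in values_per_channel for v in vals]
    return [(min(flat), max(flat))]


# Two calibration samples, two channels each (made-up numbers).
samples = [
    [[-1.0, 0.5], [10.0, 20.0]],
    [[-0.2, 0.8], [-5.0, 30.0]],
]
per_tensor = minmax_range(samples)
per_chan = minmax_range(samples, per_channel=True)
print(per_tensor)
print(per_chan)
```

With per_channel=True, the narrow first channel keeps its own small range instead of inheriting the wide range of the second channel. A real implementation would presumably do the same reduction on arrays, along every axis except the channel axis.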

Motivation

Some models could perform better after quantization if the quantization parameters (zero_point/scale) were vectors instead of scalars.
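
As a rough illustration of why this can help, the sketch below derives scale/zero_point from a range using the common asymmetric uint8 affine formula. The helper `quant_params` and the example ranges are hypothetical, and this is not claimed to be the exact computation onnxruntime performs.

```python
from typing import Tuple


def quant_params(rmin: float, rmax: float, qmin: int = 0, qmax: int = 255) -> Tuple[float, int]:
    """Common affine quantization parameters for a real range [rmin, rmax]."""
    # The range must contain 0 so that zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    if rmax == rmin:
        return 1.0, qmin  # degenerate range: any scale works
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    # Clamp the zero point into the quantized integer range.
    return scale, max(qmin, min(qmax, zero_point))


# Hypothetical per-channel ranges: a narrow channel and a wide one.
channel_ranges = [(-1.0, 3.0), (0.0, 51.0)]
params = [quant_params(lo, hi) for lo, hi in channel_ranges]
scales = [s for s, _ in params]
zero_points = [z for _, z in params]

# Per-tensor calibration would use the union of both ranges instead.
tensor_scale, tensor_zero_point = quant_params(-1.0, 51.0)

print(scales, zero_points)
print(tensor_scale, tensor_zero_point)
```

In this example the narrow channel gets a scale of about 0.016 with its own parameters, versus about 0.204 with the shared per-tensor parameters, so its values are quantized with much finer resolution.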

What is expected?

Add a per_channel=False attribute to the CalibraterBase class:

from typing import Optional, Sequence


class CalibraterBase:
    def __init__(
        self,
        model,
        op_types_to_calibrate: Optional[Sequence[str]] = None,
        augmented_model_path="augmented_model.onnx",
        symmetric=False,
        use_external_data_format=False,
        per_channel=False,
    ):
        """
        :param model: ONNX model to calibrate. It can be a ModelProto or a model path
        :param op_types_to_calibrate: operator types to calibrate. By default, calibrate all the float32/float16 tensors.
        :param augmented_model_path: save augmented model to this path.
        :param symmetric: make range of tensor symmetric (central point is 0).
        :param use_external_data_format: use external data format to store model which size is >= 2Gb.
        :param per_channel: whether to compute range as vector.
        """
    ...

Then update the calibration algorithms to support the new option.

Jan 22 '24 15:01 Johansmm

I am currently working on a proof of concept to better illustrate the expected solution for MinMaxCalibrater.

Jan 22 '24 15:01 Johansmm

@Johansmm, we currently support per-channel quantization only for weights, not activations. The reason is that it is not easy to make the underlying kernels fast when per-channel is supported for both weights and activations. It would be great if you could contribute.

Jan 22 '24 16:01 yufenglee

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

Feb 22 '24 15:02 github-actions[bot]
