
[python] Include 'per_channel' attribute when calibrating

Open Johansmm opened this issue 1 year ago • 3 comments

Describe the issue

Currently, the calibration algorithms compute scalar ranges, i.e. a single range for each calibrated tensor. There is no option to calibrate with per_channel = True.

It would be useful to have a 'per_channel' option in the initialization parameters of every 'Calibrater' class, so that calibration produces a vector of ranges (one per channel) instead of a single scalar range.
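
To make the difference concrete, here is a minimal sketch of the two reductions. It is not onnxruntime code: the helper names and the sample activations are hypothetical, and it uses plain Python lists (channels as columns) to stay dependency-free.

```python
from typing import List, Tuple

Range = Tuple[float, float]


def per_tensor_range(batch: List[List[float]]) -> Range:
    """One (min, max) pair for the whole tensor (the scalar behaviour described above)."""
    flat = [value for row in batch for value in row]
    return min(flat), max(flat)


def per_channel_range(batch: List[List[float]]) -> List[Range]:
    """One (min, max) pair per channel; channels are the columns of `batch`."""
    return [(min(column), max(column)) for column in zip(*batch)]


# Hypothetical activations: 3 samples x 2 channels with very different scales.
activations = [
    [0.1, -50.0],
    [0.3, 20.0],
    [-0.2, 80.0],
]

tensor_range = per_tensor_range(activations)
channel_ranges = per_channel_range(activations)
```

With the scalar reduction, the narrow first channel inherits the wide range of the second one; the per-channel reduction keeps a separate range for each.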

Motivation

Some models could perform better after quantization if the quantization parameters (zero_point/scale) were per-channel vectors instead of scalars.
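
As an illustration of why this can matter, the sketch below derives uint8 scale/zero_point values from ranges using the common affine convention (range widened to include zero, zero_point clamped to the integer range). This is an assumption about the convention, not a copy of onnxruntime's implementation; `quant_params` and the example ranges are hypothetical.

```python
from typing import List, Tuple


def quant_params(rmin: float, rmax: float, qmin: int = 0, qmax: int = 255) -> Tuple[float, int]:
    """Return (scale, zero_point) for an asymmetric affine quantizer."""
    # Widen the range so that 0.0 is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    if rmax == rmin:
        # Degenerate all-zero range: any positive scale works.
        return 1.0, qmin
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    # Clamp the zero point into the representable integer range.
    return scale, max(qmin, min(qmax, zero_point))


# Hypothetical per-channel ranges for a 2-channel tensor.
channel_ranges: List[Tuple[float, float]] = [(-0.2, 0.3), (-50.0, 80.0)]

# Vector parameters: one (scale, zero_point) per channel.
per_channel_params = [quant_params(lo, hi) for lo, hi in channel_ranges]

# Scalar parameters: one (scale, zero_point) from the overall range.
overall_min = min(lo for lo, _ in channel_ranges)
overall_max = max(hi for _, hi in channel_ranges)
per_tensor_params = quant_params(overall_min, overall_max)
```

In this example the first channel gets a scale of roughly 0.002 when calibrated on its own, versus roughly 0.51 when it shares the scalar range, so its values are resolved far more finely.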

What is expected?

Add a per_channel = False parameter to the CalibraterBase class:

from typing import Optional, Sequence

class CalibraterBase:
    def __init__(
        self,
        model,
        op_types_to_calibrate: Optional[Sequence[str]] = None,
        augmented_model_path="augmented_model.onnx",
        symmetric=False,
        use_external_data_format=False,
        per_channel=False,
    ):
        """
        :param model: ONNX model to calibrate. It can be a ModelProto or a model path
        :param op_types_to_calibrate: operator types to calibrate. By default, calibrate all the float32/float16 tensors.
        :param augmented_model_path: save augmented model to this path.
        :param symmetric: make range of tensor symmetric (central point is 0).
        :param use_external_data_format: use external data format to store model which size is >= 2Gb.
        :param per_channel: whether to compute range as vector.
        """
    ...

Then update the calibration algorithms to support the new option.
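
As a rough sketch of what such an update could look like for a min/max style algorithm, the toy class below keeps a running range across calibration batches and switches between one scalar range and one range per channel. It is a hypothetical stand-in: `ToyMinMaxCalibrater`, its `collect` method, and the list-based batches are assumptions for illustration, not onnxruntime's MinMaxCalibrater.

```python
from typing import List, Optional, Sequence, Tuple

Range = Tuple[float, float]


class ToyMinMaxCalibrater:
    """Hypothetical min/max calibrator; batches are rows of per-channel values."""

    def __init__(self, per_channel: bool = False) -> None:
        self.per_channel = per_channel
        self.ranges: Optional[List[Range]] = None

    def collect(self, batch: Sequence[Sequence[float]]) -> None:
        """Merge the range(s) of one batch into the running range(s)."""
        if self.per_channel:
            # One (min, max) per column, i.e. per channel.
            new = [(min(column), max(column)) for column in zip(*batch)]
        else:
            # A single (min, max) over every value in the batch.
            flat = [value for row in batch for value in row]
            new = [(min(flat), max(flat))]
        if self.ranges is None:
            self.ranges = new
        else:
            self.ranges = [
                (min(old_lo, new_lo), max(old_hi, new_hi))
                for (old_lo, old_hi), (new_lo, new_hi) in zip(self.ranges, new)
            ]


# Two hypothetical calibration batches with 2 channels each.
batches = [
    [[0.1, -50.0], [0.3, 20.0]],
    [[-0.2, 80.0]],
]

scalar_calibrater = ToyMinMaxCalibrater(per_channel=False)
vector_calibrater = ToyMinMaxCalibrater(per_channel=True)
for batch in batches:
    scalar_calibrater.collect(batch)
    vector_calibrater.collect(batch)
```

Keeping the default at per_channel=False leaves the existing scalar behaviour unchanged for current callers.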

Johansmm • Jan 22 '24 15:01

I am currently working on a proof of concept to better illustrate the expected solution for MinMaxCalibrater.

Johansmm • Jan 22 '24 15:01

@Johansmm, we currently support per-channel quantization only for weights, not for activations. The reason is that it is not easy to make the underlying kernels fast when both weights and activations are per-channel. It would be great if you could contribute.

yufenglee • Jan 22 '24 16:01

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions[bot] • Feb 22 '24 15:02