auto-round
Hook AutoHfQuantizer of transformers to support different backends and mixed-precision quantization
Feature request
1. Support different kernels in different backends, including gptq/awq/itrex.
2. Support different bits and group_size for different layers.
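To make the second request concrete, here is a minimal sketch of how per-layer settings might be resolved against a model-wide default. It is only an illustration: `LayerQuantSpec`, `MixedQuantConfig`, `resolve`, and the layer names are hypothetical and are not part of auto-round or transformers. The backend names are the ones listed in this request.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict

# Backends named in this feature request; the set itself is an assumption.
SUPPORTED_BACKENDS = {"gptq", "awq", "itrex"}


@dataclass(frozen=True)
class LayerQuantSpec:
    """Hypothetical quantization settings for a single layer."""
    bits: int = 4
    group_size: int = 128
    backend: str = "gptq"


@dataclass
class MixedQuantConfig:
    """Hypothetical config: one default spec plus per-layer overrides."""
    default: LayerQuantSpec = field(default_factory=LayerQuantSpec)
    # Maps a full layer name to the fields that differ from the default.
    overrides: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def resolve(self, layer_name: str) -> LayerQuantSpec:
        """Return the effective spec for a layer, applying any override."""
        spec = self.default
        override = self.overrides.get(layer_name)
        if override:
            spec = replace(spec, **override)
        if spec.backend not in SUPPORTED_BACKENDS:
            raise ValueError(f"unsupported backend: {spec.backend!r}")
        return spec


if __name__ == "__main__":
    cfg = MixedQuantConfig(
        overrides={
            # Illustrative layer names only.
            "model.layers.0.mlp.down_proj": {"bits": 8, "group_size": 32},
            "lm_head": {"backend": "awq"},
        }
    )
    for name in ("model.layers.0.mlp.down_proj", "lm_head", "model.layers.1.self_attn.q_proj"):
        print(name, cfg.resolve(name))
```

A quantizer hook could call something like `resolve` once per linear layer while replacing modules, then pick the kernel from the resolved `backend`. Whether overrides are keyed by exact name, as here, or by pattern is an open design choice.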