auto-round: hook AutoHfQuantizer of transformers to support different backends and mixed-precision quantization

Open wenhuach21 opened this issue 1 month ago • 1 comment

Feature request

1. Support different kernels for different backends, including gptq/awq/itrex.
2. Support different bits and group_size for different layers.
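To make the second request concrete, here is a minimal sketch of how per-layer settings might be resolved. It is not auto-round's or transformers' actual API: `LayerQuantConfig`, `resolve_layer_config`, the glob-style `overrides` mapping, and the layer names are all hypothetical, chosen only to illustrate a default config with per-layer overrides for bits, group_size, and backend.

```python
from dataclasses import dataclass, replace
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class LayerQuantConfig:
    """Hypothetical per-layer quantization settings (illustration only)."""
    bits: int = 4
    group_size: int = 128
    backend: str = "gptq"  # e.g. "gptq", "awq", or "itrex"


def resolve_layer_config(layer_name, default, overrides):
    """Return the config for one layer.

    `overrides` maps glob patterns to the fields that should differ from
    `default`. Patterns are checked in insertion order; the first match wins.
    """
    for pattern, fields in overrides.items():
        if fnmatchcase(layer_name, pattern):
            return replace(default, **fields)
    return default


default = LayerQuantConfig()
overrides = {
    # Assumed layer names, following a common decoder-only naming scheme.
    "model.layers.0.*": {"bits": 8, "group_size": 32},
    "*.down_proj": {"bits": 8},
    "lm_head": {"backend": "awq"},
}

for name in (
    "model.layers.0.self_attn.q_proj",
    "model.layers.5.mlp.down_proj",
    "model.layers.5.mlp.up_proj",
    "lm_head",
):
    print(name, resolve_layer_config(name, default, overrides))
```

Under this sketch a hooked quantizer would look up each layer's resolved config and dispatch to the kernel named by its `backend` field, which is where the first request (different kernels per backend) would plug in.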

May 16 '24 01:05 wenhuach21
