ScaleLLM
Are there any plans to support int8 weight quantization?
Int8 weight quantization is essentially lossless in practice. While AWQ offers a good balance of accuracy and performance, int8 is the better choice in scenarios where accuracy matters most. So I'd like to ask whether there are plans to support this feature.
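For context, here is a minimal sketch of the kind of scheme the request refers to: symmetric per-channel int8 weight quantization with a float scale per output channel. This is illustrative only; the function names are hypothetical and do not correspond to any ScaleLLM API.

```python
import numpy as np

def quantize_int8_per_channel(w: np.ndarray):
    """Symmetric per-output-channel int8 quantization.

    w: float weight matrix of shape (out_features, in_features).
    Returns the int8 weights and the per-channel float scales.
    """
    # One scale per output channel, chosen so the largest-magnitude
    # weight in that channel maps to 127.
    max_abs = np.abs(w).max(axis=1, keepdims=True)
    scale = np.maximum(max_abs, 1e-8) / 127.0  # avoid divide-by-zero
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    # Recover an approximate float matrix; the per-weight error is
    # bounded by scale / 2, which is why int8 is near-lossless.
    return q.astype(np.float32) * scale

# Quick round-trip check on random weights.
w = np.random.randn(4096, 4096).astype(np.float32)
q, s = quantize_int8_per_channel(w)
w_hat = dequantize_int8(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```

Because the scale is per channel and the quantization grid is fine (255 levels over each channel's range), the reconstruction error stays small relative to the weight magnitudes, which is the "near-lossless" property mentioned above.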