[WIP] Support Hqq and Eetq quantization

Open hjh0119 opened this issue 1 month ago • 0 comments

PR type

  • [x] Bug Fix
  • [x] New Feature
  • [x] Document Updates
  • [ ] More Models or Datasets Support

PR information

New quantization algorithms: HQQ and EETQ.

Unify the quantization bit parameter as quantization_bit.

  • The quant_bits parameter is deprecated; for compatibility purposes, it has been synchronized to quantization_bit.
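The compatibility handling described above could look roughly like the sketch below. It is only an illustration: the `QuantArgs` class, its `None` defaults, and the precedence rule (an explicit quantization_bit wins over quant_bits) are assumptions made for this example, not swift's actual argument class.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantArgs:
    """Hypothetical argument container, used only for illustration."""

    quantization_bit: Optional[int] = None
    quant_bits: Optional[int] = None  # deprecated alias, kept for compatibility

    def __post_init__(self) -> None:
        if self.quant_bits is None:
            return
        warnings.warn(
            "quant_bits is deprecated; use quantization_bit instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: an explicitly set quantization_bit takes precedence.
        if self.quantization_bit is None:
            self.quantization_bit = self.quant_bits


# The old name still works, and the value lands in the new field.
args = QuantArgs(quant_bits=4)
print(args.quantization_bit)
```

With this shape, existing scripts that pass quant_bits keep working, while the rest of the code only ever reads quantization_bit.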

Pin the dtype to bf16 to resolve the garbled output during inference with internvl-chat-v1.5-int8 on V100 machines.

  • https://github.com/modelscope/swift/issues/890

hjh0119, May 10 '24 03:05