quantize models with large context
I want to quantize the CodeQwen model using a custom dataset, but all of my sample lengths exceed 512 tokens. Why doesn't AWQ support samples longer than 512 tokens? Are there any alternative methods for quantizing models with a large context?
You can look at my GitHub repo AutoAWQ-with-quantizer (built against autoawq==0.24), which modifies the quantizer: https://github.com/WanBenLe/AutoAWQ-with-quantizer
This is on the roadmap as the next development item. I want to completely rework how calibration is executed and document it.
I just spotted this issue as well: calibration inputs are capped at 512 tokens, regardless of the block size provided: https://github.com/casper-hansen/AutoAWQ/blob/5f3785dcaa107ca76f5fa5355f459370c86f82d6/awq/utils/calib_data.py#L50
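To illustrate why a dataset where every sample exceeds 512 tokens produces no usable calibration data, here is a simplified sketch of the filtering behavior described above. This is not the actual AutoAWQ source; the function and variable names are hypothetical, and the logic is paraphrased: samples longer than a hardcoded cap are dropped, ignoring the caller-supplied block size.

```python
# Hypothetical sketch (not AutoAWQ's real code) of a calibration loader
# that drops any sample longer than a hardcoded cap.
HARDCODED_CAP = 512

def collect_calib_samples(encoded_samples, block_size=2048):
    """Keep only samples at or under the hardcoded cap, then trim each
    kept sample to block_size. Note the cap is applied regardless of
    the block_size argument."""
    kept = []
    for tokens in encoded_samples:
        if len(tokens) > HARDCODED_CAP:  # filter ignores block_size
            continue
        kept.append(tokens[:block_size])
    return kept

# A custom dataset where every sample exceeds 512 tokens is filtered
# down to nothing, so calibration has no data to work with:
long_only = [list(range(600)), list(range(1000))]
print(len(collect_calib_samples(long_only, block_size=2048)))  # 0
```

With a mix of short and long samples, only the short ones survive, which is why quantization works for typical datasets but fails when every sample has a large context.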