quantize models with large context
I want to quantize the CodeQwen model using a custom dataset, but all of my sample lengths exceed 512 tokens. Why doesn't AWQ support samples longer than 512 tokens? Are there any alternative methods for quantizing models with a large context?
You can look at my GitHub repo AutoAWQ-with-quantizer (built against autoawq==0.24), which modifies the quantizer: https://github.com/WanBenLe/AutoAWQ-with-quantizer
This is on the roadmap as the next development item. I want to completely rework how calibration is executed and document it.
I just spotted this issue as well: calibration inputs are capped at 512 tokens, regardless of the block size provided: https://github.com/casper-hansen/AutoAWQ/blob/5f3785dcaa107ca76f5fa5355f459370c86f82d6/awq/utils/calib_data.py#L50
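To illustrate why a dataset where every sample exceeds 512 tokens produces no usable calibration data, here is a simplified sketch of the filtering behavior described above. This is not the actual AutoAWQ source; the function and variable names are hypothetical, and the logic is paraphrased: samples longer than a hardcoded cap are dropped, ignoring the caller-supplied block size.

```python
# Hypothetical sketch (not AutoAWQ's real code) of a calibration loader
# that drops any sample longer than a hardcoded cap.
HARDCODED_CAP = 512

def collect_calib_samples(encoded_samples, block_size=2048):
    """Keep only samples at or under the hardcoded cap, then trim each
    kept sample to block_size. Note the cap is applied regardless of
    the block_size argument."""
    kept = []
    for tokens in encoded_samples:
        if len(tokens) > HARDCODED_CAP:  # filter ignores block_size
            continue
        kept.append(tokens[:block_size])
    return kept

# A custom dataset where every sample exceeds 512 tokens is filtered
# down to nothing, so calibration has no data to work with:
long_only = [list(range(600)), list(range(1000))]
print(len(collect_calib_samples(long_only, block_size=2048)))  # 0
```

With a mix of short and long samples, only the short ones survive, which is why quantization works for typical datasets but fails when every sample has a large context.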