AutoAWQ
"Clarification on Multimodal Model Quantization and Default Calibration Dataset"
Hello,
I have a few questions regarding the quantization of multimodal models:
1. Does the current version of AutoAWQ quantize only the language model, or does it also quantize the vision component?
2. What is the default calibration dataset used for quantization?
3. I noticed that the example code for Qwen2-VL uses a custom multimodal dataset. Is this dataset required for all multimodal model quantization, or can we use the default dataset? (For context, the sketch after this list shows roughly the flow I am familiar with.)

Thank you for your clarification!
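For reference, this is roughly the text-only flow from the README (model and output paths are placeholders); since no calibration data is passed explicitly, I am unsure which dataset is used by default or how this changes for a multimodal model:

```python
# Rough text-only quantization flow, adapted from the AutoAWQ README.
# Model and output paths are placeholders for illustration.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2-7B-Instruct"   # placeholder text-only model
quant_path = "qwen2-7b-instruct-awq"    # placeholder output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize; no calibration data is passed, so the library's built-in default is used
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized weights and tokenizer
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```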
Hi, @donghong1
- As far as I know, AWQ itself quantizes only the language model, not the vision encoder. I recall a paper that proposed quantizing vision encoders as well, but I'm not sure which one it was.
- It seems the authors used the pile-val-backup dataset in their paper.
- I am not 100% sure, but I guess we should use a multimodal calibration dataset for VLMs. A rough sketch of how the calibration data can be overridden is below.
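For what it's worth, here is a minimal sketch of overriding the calibration data in the text-only path. The `calib_data` argument and the list-of-strings usage are assumptions based on the text-only API, so please verify them against the official Qwen2-VL example, which builds a multimodal calibration set instead:

```python
# Minimal sketch: overriding calibration data in the text-only quantize() path.
# Treat calib_data and the list-of-strings usage as assumptions to verify;
# the official Qwen2-VL example builds a multimodal calibration set instead.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2-7B-Instruct"  # placeholder text-only model
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Custom calibration samples: plain text drawn from the target domain.
# They are only used to collect activation statistics during the AWQ search.
# In practice you would pass a few hundred representative texts; the repetition
# below is just to reach a realistic sample count for this sketch.
calib_samples = [
    "A representative sentence from the domain the quantized model will serve.",
    "Another sample used solely to estimate activation scales.",
] * 128

model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_samples)
model.save_quantized("qwen2-7b-instruct-awq-custom-calib")
```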