mlx-vlm

Add AWQ/DWQ for Vision Models

Open • Blaizzy opened this issue 9 months ago • 0 comments

Investigate and implement Activation-aware Weight Quantization (AWQ) and Dynamic Weight Quantization (DWQ) techniques specifically for vision models.

Motivation: Vision models often have large parameter counts and compute requirements. Effective quantization techniques such as AWQ and DWQ could significantly reduce the compute and memory footprint while maintaining acceptable quality, which is particularly important for multimodal applications.

Blaizzy, May 06 '25 22:05