mlx-swift-examples
Qwen3 with heterogenous quant doesn't work
The model mlx-community/Qwen3-1.7B-4bit-AWQ doesn't run in the mlx-swift-examples repo. It fails with a mismatched shape error in the scales. I suspect it's due to the heterogenous quant not being parsed properly. See e.g. https://huggingface.co/mlx-community/Qwen3-1.7B-4bit-AWQ/blob/main/config.json#L20
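To illustrate why a per-layer group size could surface as a shape mismatch in the scales: MLX's affine quantization stores one scale per group of `group_size` input elements, so the scales array has `in_dims // group_size` columns. The sketch below is only arithmetic. The layer dimensions are hypothetical, and it assumes the loader applies the global `group_size` to a layer that was actually quantized with a different one, which hasn't been confirmed here.

```python
def scales_shape(out_dims: int, in_dims: int, group_size: int) -> tuple:
    """Shape of the scales array for a quantized (out_dims, in_dims) weight.

    One scale is stored per group of `group_size` input elements.
    """
    return (out_dims, in_dims // group_size)


# Hypothetical layer dimensions, chosen only for illustration.
OUT_DIMS, IN_DIMS = 1024, 2048

# What a loader would expect if it used the global group size everywhere.
expected = scales_shape(OUT_DIMS, IN_DIMS, group_size=64)

# What the checkpoint would contain if this layer was quantized with 32.
stored = scales_shape(OUT_DIMS, IN_DIMS, group_size=32)

mismatch = expected != stored
```

With these numbers the expected shape is `(1024, 32)` while the stored one is `(1024, 64)`, which is the kind of mismatch the error message suggests.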
Ah yes, the quant code has no idea what to do with that (yet) -- I haven't seen this format before.
Yea it's relatively new. We use a custom class predicate in mlx-lm which holds the config and reads it to figure out what parameters to use for a given layer.
And in mlx, nn.quantize takes a class predicate which can return either True/False or the quantization parameters.
See e.g.
https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/utils.py#L201-L208
https://github.com/ml-explore/mlx/blob/main/python/mlx/nn/layers/quantized.py#L29-L34
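A rough sketch of what such a predicate can look like, written without mlx so it runs standalone. It is not the code behind the links above. The layout of the `quantization` dict (global `group_size` and `bits` plus per-layer entries keyed by module path), the `make_class_predicate` helper, and the `FakeLinear` / `FakeNorm` stand-ins are all assumptions made for illustration.

```python
from typing import Callable, Union


def make_class_predicate(
    quantization: dict, weight_keys: set
) -> Callable[[str, object], Union[bool, dict]]:
    """Build a predicate that returns per-layer params, True, or False."""

    def predicate(path: str, module: object) -> Union[bool, dict]:
        # A per-layer override in the config wins (assumed to be a dict
        # such as {"group_size": 32, "bits": 4} keyed by module path).
        override = quantization.get(path)
        if isinstance(override, dict):
            return override
        # Modules that cannot be quantized are skipped.
        if not hasattr(module, "to_quantized"):
            return False
        # Otherwise quantize with the global settings only if the
        # checkpoint actually contains scales for this layer.
        return f"{path}.scales" in weight_keys

    return predicate


class FakeLinear:
    """Stand-in for a quantizable layer (hypothetical)."""

    def to_quantized(self, group_size: int, bits: int):
        return self


class FakeNorm:
    """Stand-in for a layer that has no quantized form (hypothetical)."""


# Assumed config layout: global defaults plus one per-layer override.
quantization = {
    "group_size": 64,
    "bits": 4,
    "model.embed_tokens": {"group_size": 32, "bits": 4},
}
weight_keys = {"model.embed_tokens.scales", "model.layers.0.mlp.up_proj.scales"}

predicate = make_class_predicate(quantization, weight_keys)
```

In mlx the result of a predicate like this would be passed as `class_predicate` to `nn.quantize`. A Swift port would presumably need an equivalent hook that can return either a boolean or per-layer parameters.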
This is going to be especially useful here because more heterogenous + AWQ quants make a much bigger difference for smaller models.