Prince Canuma


> I am not sure that we don't need it. Just as an example: if I want to load a model that is expected to use the default chat template....

How many times have you encountered this issue? And could you point me to some example chat/instruct models that also have this issue?

The `chat_template` field is the standard now. Check the majority of SFT chat/instruct models: all of them have it set, as does any base model fine-tuned for chat/instruction with TRL, Axolotl, LLaMA Factory, and so on...
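
As an aside, here is a minimal sketch of how a caller can check whether a model ships a template, using Hugging Face `transformers`; the model id and the fallback prompt format are hypothetical, purely for illustration:

```python
from transformers import AutoTokenizer

# Hypothetical model id, for illustration only.
tokenizer = AutoTokenizer.from_pretrained("my-org/my-sft-model")

messages = [{"role": "user", "content": "Hello!"}]

if tokenizer.chat_template is not None:
    # SFT chat/instruct models ship a template in tokenizer_config.json.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
else:
    # A base model without a template needs an explicit fallback format.
    prompt = f"User: {messages[-1]['content']}\nAssistant:"
```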

> It's not a bug... at the risk of being redundant, the last dimension of the matrix has to be divisible by the quantization group size. For the size 4304...
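
For what it's worth, the arithmetic behind that constraint, assuming MLX's default group size of 64 (the alternative group size below is just one that happens to divide evenly):

```python
# With the default group size of 64, a last dimension of 4304 fails:
assert 4304 % 64 == 16  # 4304 = 64 * 67 + 16, not divisible
# A smaller group size that divides 4304 evenly would work, e.g. 16:
assert 4304 % 16 == 0   # 4304 = 16 * 269
```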

Is there a way in MLX to skip a particular target layer or block X in the model? I mean a specific one, not all layers of the same type, which is what `class_predicate` does.

It works wonders! 💯 Also found a better way, skipping the entire block:

```python
# Quantize only the Linear layers that live outside the vision tower.
class_predicate = lambda p, m: isinstance(m, nn.Linear) and p.split('.')[0] != "vision_tower"
```
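
For completeness, a sketch of how that predicate plugs into MLX's quantizer; the toy model and layer sizes are made up for illustration, and the call assumes `mlx.nn.quantize`'s `(path, module)` predicate hook with its default `group_size` and `bits`:

```python
import mlx.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(64, 64)  # skipped by the predicate
        self.lm_head = nn.Linear(64, 64)       # quantized

model = Toy()
class_predicate = lambda p, m: isinstance(m, nn.Linear) and p.split('.')[0] != "vision_tower"

# Replaces the matching nn.Linear modules with quantized equivalents in place.
nn.quantize(model, group_size=64, bits=4, class_predicate=class_predicate)
```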