Prince Canuma


> I am not sure that we don't need it. Just as an example: if I want to load a model that is expected to use the default chat template....

How many times have you encountered this issue? And could you point me to some example chat/instruct models that also have this issue?

The `chat_template` field is the standard now. Check the majority of SFT chat/instruct models: all of them have it set, as does any base model fine-tuned for chat/instruction with TRL, Axolotl, LLaMA Factory, and so on...
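
As an aside, here is a minimal sketch of how a caller can check whether a model ships a template, using Hugging Face `transformers`; the model id and the fallback prompt format are hypothetical, purely for illustration:

```python
from transformers import AutoTokenizer

# Hypothetical model id, for illustration only.
tokenizer = AutoTokenizer.from_pretrained("my-org/my-sft-model")

messages = [{"role": "user", "content": "Hello!"}]

if tokenizer.chat_template is not None:
    # SFT chat/instruct models ship a template in tokenizer_config.json.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
else:
    # A base model without a template needs an explicit fallback format.
    prompt = f"User: {messages[-1]['content']}\nAssistant:"
```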

> It's not a bug... at the risk of being redundant, the last dimension of the matrix has to be divisible by the quantization group size. For the size 4304...
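
For what it's worth, the arithmetic behind that constraint, assuming MLX's default group size of 64 (the alternative group size below is just one that happens to divide evenly):

```python
# With the default group size of 64, a last dimension of 4304 fails:
assert 4304 % 64 == 16  # 4304 = 64 * 67 + 16, not divisible
# A smaller group size that divides 4304 evenly would work, e.g. 16:
assert 4304 % 16 == 0   # 4304 = 16 * 269
```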

Is there a way in MLX to skip a particular target layer or block X in the model? I mean a specific one, not all layers of the same type, which is what `class_predicate` does.

It works wonders! 💯 Also found a better way, skipping the entire block:

```python
# Quantize only the Linear layers that live outside the vision tower.
class_predicate = lambda p, m: isinstance(m, nn.Linear) and p.split('.')[0] != "vision_tower"
```
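
For completeness, a sketch of how that predicate plugs into MLX's quantizer; the toy model and layer sizes are made up for illustration, and the call assumes `mlx.nn.quantize`'s `(path, module)` predicate hook with its default `group_size` and `bits`:

```python
import mlx.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(64, 64)  # skipped by the predicate
        self.lm_head = nn.Linear(64, 64)       # quantized

model = Toy()
class_predicate = lambda p, m: isinstance(m, nn.Linear) and p.split('.')[0] != "vision_tower"

# Replaces the matching nn.Linear modules with quantized equivalents in place.
nn.quantize(model, group_size=64, bits=4, class_predicate=class_predicate)
```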