Awni Hannun

Results: 1014 comments of Awni Hannun

Awesome, I will try with the chat template! If you are able to upload the 4-bit version w/ the chat template to the MLX Community I think that would be...

- Fixed rope to traditional
- Fixed an issue with layer norm upcasting to fp32
- Rebased on main + ran formatting

> Btw, could you explain what is the difference between rope traditional on and off? When should I use one vs the other? Also, what output did you get with...
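For context on the question, here is a rough NumPy sketch (not MLX's actual implementation) of the two common RoPE conventions: the "traditional" form rotates interleaved pairs `(x[2i], x[2i+1])`, while the other variant rotates split halves `(x[i], x[i + d//2])`. The two produce different outputs, so the flag has to match how the model was trained.

```python
import numpy as np

def rope_traditional(x, theta):
    # "Traditional" RoPE: rotate interleaved pairs (x[2i], x[2i+1]).
    out = np.empty_like(x)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out[..., 0::2] = x1 * np.cos(theta) - x2 * np.sin(theta)
    out[..., 1::2] = x1 * np.sin(theta) + x2 * np.cos(theta)
    return out

def rope_split_half(x, theta):
    # Non-traditional variant: rotate split halves (x[i], x[i + d//2]).
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    return np.concatenate(
        [x1 * np.cos(theta) - x2 * np.sin(theta),
         x1 * np.sin(theta) + x2 * np.cos(theta)],
        axis=-1,
    )

x = np.arange(4.0)             # one head dimension of size 4
theta = np.array([0.5, 0.25])  # one rotation angle per pair
print(rope_traditional(x, theta))
print(rope_split_half(x, theta))  # different result from the traditional form
```

Both are pure rotations (they preserve the norm of the vector); the only difference is which dimensions get paired up.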

> Out of curiosity and possibly showing my ignorance, why not use mx.fast.scaled_dot_product_attention for phi and phixtral as well? Yes it's really important for phi to upcast the queries (and...
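On the upcasting point, a minimal NumPy illustration (a sketch, not the actual `mx.fast.scaled_dot_product_attention` code) of what "upcast the queries" means: the query/key dot products are accumulated in float32 even when the inputs are half precision, since long reductions in fp16 lose precision.

```python
import numpy as np

def attention_scores(q, k, upcast=True):
    # Sketch: compute scaled q @ k^T attention logits.
    # Assumed shapes: q is (L, d), k is (S, d), both float16.
    if upcast:
        q, k = q.astype(np.float32), k.astype(np.float32)
    scale = 1.0 / q.shape[-1] ** 0.5
    return (q @ k.T) * scale

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 64)).astype(np.float16)
k = rng.normal(size=(4, 64)).astype(np.float16)
scores = attention_scores(q, k)                      # accumulated in float32
scores_fp16 = attention_scores(q, k, upcast=False)   # accumulated in float16
```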

Cool! Although I'm wondering how that will work for encoder/decoder style models in MLX LM. We have a [T5 example](https://github.com/ml-explore/mlx-examples/tree/main/t5) you can use as a reference. If it doesn't...

We don't have such an operation, sorry! You could do something like:

```python
import mlx.core as mx

def nansum(x):
    # Sum over x, treating NaNs as zero
    return mx.sum(mx.where(mx.isnan(x), 0, x))
```
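As a quick sanity check, the same masking pattern with NumPy in place of `mlx.core` behaves like a NaN-ignoring sum:

```python
import numpy as np

def nansum(x):
    # Same pattern as the mlx snippet, using NumPy in place of mlx.core.
    return np.sum(np.where(np.isnan(x), 0, x))

x = np.array([1.0, np.nan, 2.0])
print(nansum(x))  # 3.0
```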

Just curious, what cases does this fix that currently do not work? Don't the instruction tuned models have the template in the tokenizer? This would fix the commandR issue for...

@Blaizzy @mzbac it sounds like there is still an issue here? Are you intending to send a fix?

🤔 so if I understand correctly, this PR will improve the default setting for some models (by using the default chat template) but for other models it might be worse...

> The HF tokenizer.apply_chat_template is showing a warning when applying the default template

Nice, that's useful to know!

> I am happy to close this PR for now if we...