Prince Canuma

151 comments by Prince Canuma

Done, the 4-bit model with the updated tokenizer is available on the hub. Link: [mlx-community/c4ai-command-r-v01-4bit](https://huggingface.co/mlx-community/c4ai-command-r-v01-4bit)
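
A minimal sketch of loading that model with `mlx-lm`, assuming the package is installed; the prompt and `max_tokens` value are illustrative, not from the original comment:

```python
# Sketch: load the 4-bit Command R model from the Hugging Face hub with mlx-lm
# and run a short generation. Prompt and generation settings are illustrative.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/c4ai-command-r-v01-4bit")
response = generate(model, tokenizer, prompt="Hello", max_tokens=100, verbose=True)
print(response)
```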

Thanks! Indeed you have a point, and I see we're all converging to a de facto template. However, what happens when it's a pretrained model without any instruction tuning? Won't defaulting (fallback)...

Makes sense to me :) As an ML Engineer this is intuitive. I'm just thinking about UX for the users at large that might not understand the distinction. They could just run...

> Currently, the implementation only applies the chat template when the tokenizer has an explicitly specified chat template in the tokenizer configuration or implementation. Recently released models tend to use...

@mzbac Got it, it works with both pre-trained and instruction models. 👍🏽 But I noticed that with starcoder2-3b it gives inconsistent results.

```shell
python -m mlx_lm.generate --model mlx-community/starcoder2-3b-4bit --prompt "Write...
```

Found the issue: without the condition to check `tokenizer.chat_template is not None`, the condition becomes `True`, which triggers this warning: `No chat template is defined for this tokenizer - using...`
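
A minimal sketch of the guard being discussed, assuming the Hugging Face `transformers` tokenizer API; the `messages` variable is illustrative:

```python
# Sketch: only apply the chat template when one is explicitly defined on the
# tokenizer; otherwise transformers falls back to default_chat_template and
# logs the "No chat template is defined for this tokenizer" warning.
messages = [{"role": "user", "content": "Write a quicksort in Python"}]  # illustrative

if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
else:
    prompt = messages[-1]["content"]  # pass the raw prompt through for base models
```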

> @Blaizzy @mzbac it sounds like there is still an issue here? Are you intending to send a fix? I was unwell but I'm back. @mzbac's last commit ([df1eb23](https://github.com/ml-explore/mlx-examples/pull/577/commits/df1eb2304e8b09f734ba285a617736a1d44c2376))...

> The challenge with patching the chat_template is sometimes it's quite hard to get the original model patched since the mlx-lm is compatible with HF format models, which may introduce...

> Yeah, the mlx-lm can directly load HF models. So if the original model has not been patched and loaded via mlx-lm as expected using the default chat model, it...

TL;DR: We don't need this change, but if we decide to add it, we'd better add: 1. A warning that `default_chat_template` might produce wrong results 2. A condition to check...
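
A sketch of how those two suggestions could fit together, assuming the (since-deprecated) `transformers` `default_chat_template` property; the `use_default_chat_template` flag and variable names are hypothetical, for illustration only:

```python
import logging

# Sketch: prefer an explicitly defined chat template; only fall back to
# default_chat_template when asked to, and warn that it may be wrong.
if tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
elif use_default_chat_template:  # hypothetical opt-in flag
    logging.warning(
        "Falling back to tokenizer.default_chat_template; "
        "it may produce wrong results for this model."
    )
    prompt = tokenizer.apply_chat_template(
        messages,
        chat_template=tokenizer.default_chat_template,
        tokenize=False,
        add_generation_prompt=True,
    )
else:
    prompt = messages[-1]["content"]  # base model: use the raw prompt
```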