Prince Canuma

Results: 151 comments by Prince Canuma

> Hey @Blaizzy, I have run the exact same test with the new llama.cpp implementation of Command-R+ and it works way above 8k tokens. @jeanromainroy can you try again with...

You can also try to increase the default `max_position_embeddings` and let me know if it works.
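For context, `max_position_embeddings` lives in the model's `config.json`. A minimal sketch of how to bump it (the `mlx_model` path is just an example; point it at wherever your converted weights live):

```python
import json
from pathlib import Path

# Example path to a converted model directory; adjust as needed.
config_path = Path("mlx_model") / "config.json"

config = json.loads(config_path.read_text())
print("current max_position_embeddings:", config.get("max_position_embeddings"))

# Raise the context window, e.g. to 128K (131072 tokens).
config["max_position_embeddings"] = 131072
config_path.write_text(json.dumps(config, indent=2))
```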

Let me know how it goes, but for now, based on your report, the issue should be fixed.

> Hey @Blaizzy , I tried your fork and the model is still outputting ... when I provide a long prompt. I have made a new change, can you try...

Wait, I think I got it! Give me 30 min :)

@jeanromainroy can you try this branch, the previous one had a git issue: https://github.com/Blaizzy/mlx-examples/tree/pc/command-R

Only PAD? Can you share the whole output?

Got it! @awni the Cohere team added `model_max_length` set to 128K on both Command-R models. Is there a way of using this number with `nn.RoPE`? Are there any...
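For what it's worth, a minimal sketch of how `nn.RoPE` is instantiated in MLX (the `head_dim` value is illustrative, not Command-R's actual config). The module is parameterized by the per-head dimension, `base`, and `scale`, with no explicit maximum-length argument, so the 128K limit would presumably have to come from the config rather than from the rotary embedding itself:

```python
import mlx.core as mx
import mlx.nn as nn

head_dim = 128  # illustrative per-head dimension

# nn.RoPE takes dims/base/scale; positions are computed on the fly,
# so there is no max-length parameter to set here.
rope = nn.RoPE(head_dim, traditional=False, base=10000)

x = mx.random.normal((1, 8, 512, head_dim))  # (batch, heads, seq, head_dim)
y = rope(x)
print(y.shape)
```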

I'm building a vLLM-specific package for MLX models, so you can move this issue here: https://github.com/Blaizzy/mlx-vllm

@NanoCode012 Could you let me know what else you are looking for?