TensorRT-LLM Support for Cohere Command-R

Status: Open. tombolano opened this issue 11 months ago • 2 comments

Cohere has released Command-R, a multilingual model optimized for long-context tasks such as retrieval-augmented generation (RAG) and for using external APIs and tools.

Release note: https://txt.cohere.com/command-r/
Weights: https://huggingface.co/CohereForAI/c4ai-command-r-v01

The evaluation results published by Cohere are very good: according to them, Command-R beats Mixtral, Llama2 70B, and ChatGPT 3.5 on RAG and tool-usage tasks.

In the llama.cpp repository there is a pull request discussion (https://github.com/ggerganov/llama.cpp/pull/6033) with some useful comments about implementing the model.