TensorRT-LLM
Support for Cohere Command-R
Cohere released "Command-R", a multilingual model optimized for long-context tasks such as retrieval-augmented generation (RAG) and for using external APIs and tools.
Release note: https://txt.cohere.com/command-r/
Weights: https://huggingface.co/CohereForAI/c4ai-command-r-v01
The evaluation results published by Cohere are very good: according to them, it beats Mixtral, Llama2 70B, and ChatGPT 3.5 on RAG and tool usage tasks.
In the llama.cpp repository there is a discussion (https://github.com/ggerganov/llama.cpp/pull/6033) that provides some useful comments about its implementation.
Cohere has now also released a larger, 104B-parameter model: C4AI Command R+
Yeah, C4AI Command R+ really needs TensorRT-LLM support. Please don't spend more effort on Llama2; LLM development is moving super fast. Guys, please hurry up.