TensorRT-LLM
[Feature request] Cohere Family of Models (Command-R, Command-R-Plus, Aya23-8B, Aya23-35B, Aya101)
Hello,
I am opening this issue to request support for the Cohere family of models:
Command-R: https://huggingface.co/CohereForAI/c4ai-command-r-v01
Command-R-Plus: https://huggingface.co/CohereForAI/c4ai-command-r-plus
Aya23-8B: https://huggingface.co/CohereForAI/aya-23-8B
Aya23-35B: https://huggingface.co/CohereForAI/aya-23-35B
Aya101: https://huggingface.co/CohereForAI/aya-101
Thank you
I would also like to request support for Command-R and Command-R+; they are currently the best open-source models.
Yes please. Command-R+ support is needed!
+1
Hi all, we've started investigating and implementing the Cohere models. Support is planned for the 0.13 release.
Nice!
@syuoni Please also support these! https://huggingface.co/CohereForAI/c4ai-command-r-08-2024 https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024
These models share the same architecture, CohereForCausalLM, so it's very likely they will be supported automatically once Command-R is ready.
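For anyone who wants to check this for another checkpoint: Hugging Face model repos list the model class in the `architectures` field of `config.json`. Below is a minimal sketch of that check. The config strings are trimmed-down, illustrative stand-ins (assumptions, not copies of the real files), and `uses_architecture` is a hypothetical helper, not part of TensorRT-LLM.

```python
import json

# Illustrative, trimmed-down config.json contents (assumed for this example).
# Real config files contain many more keys.
EXAMPLE_CONFIGS = {
    "command-r-like": '{"architectures": ["CohereForCausalLM"], "model_type": "cohere"}',
    "llama-like": '{"architectures": ["LlamaForCausalLM"], "model_type": "llama"}',
}


def uses_architecture(config_text: str, architecture: str) -> bool:
    """Return True if the config's "architectures" list names `architecture`."""
    config = json.loads(config_text)
    # Treat a missing or null field as an empty list.
    return architecture in (config.get("architectures") or [])


def main() -> None:
    for name, text in EXAMPLE_CONFIGS.items():
        print(name, uses_architecture(text, "CohereForCausalLM"))


if __name__ == "__main__":
    main()
```

To use it on a real model, pass the text of the downloaded `config.json` instead of the example strings.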
Thank you!
Does anyone know how similar the Cohere family of models is to the other open-source models supported in TensorRT-LLM?
@syuoni Is the feature delayed? If I wanted to write a converter from the Hugging Face model definition to a TensorRT-LLM checkpoint myself, which documentation should I check?
It doesn't look like it is out. None of the recent commits mention it, but version 0.13 was just released...
Hi all,
Yes. Cohere is postponed; it's not available in 0.13. Cohere has some structures quite different from LLaMA, e.g., qk_layernorm, so it took some extra time to align the accuracy.
The MR is ready and under review in our internal repo. I think it will be released soon. Thanks!
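For readers unfamiliar with qk_layernorm: the idea is to normalize the query and key vectors before computing the attention score. The following is a minimal pure-Python sketch of that idea for a single head. It is an assumption-laden illustration, not the TensorRT-LLM or Cohere implementation: `layer_norm` and `attention_score` are hypothetical names, and the learned scale weights a real model would apply are omitted.

```python
import math


def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and (roughly) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def attention_score(q, k, qk_layernorm=False):
    """Scaled dot-product score for one query/key pair of one head."""
    if qk_layernorm:
        # Normalize q and k first, so the score no longer depends
        # on the raw magnitude of either vector.
        q = layer_norm(q)
        k = layer_norm(k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [2.0, 1.0, 0.0, 3.0]
    q_scaled = [10.0 * x for x in q]
    # Without normalization the score grows with the scale of q.
    print(attention_score(q, k), attention_score(q_scaled, k))
    # With normalization the two scores are nearly identical.
    print(attention_score(q, k, True), attention_score(q_scaled, k, True))
```

Because the extra normalization changes the numerics of every attention layer, it is plausible that small implementation differences show up as accuracy mismatches, which fits the alignment work described above.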
Hi all,
The Command-R and Aya models are now supported on the main branch. See: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/commandr
I'm closing this issue. Thanks!