TensorRT-LLM [Feature request] Cohere Family of Models (Command-R, Command-R-Plus, Aya23-8B, Aya23-35B, Aya101)

Open user-0a opened this issue 1 year ago • 12 comments

Hello,

I am creating this issue for the purpose of requesting support for the Cohere family of models:

Command-R: https://huggingface.co/CohereForAI/c4ai-command-r-v01
Command-R-Plus: https://huggingface.co/CohereForAI/c4ai-command-r-plus
Aya23-8B: https://huggingface.co/CohereForAI/aya-23-8B
Aya23-35B: https://huggingface.co/CohereForAI/aya-23-35B
Aya101: https://huggingface.co/CohereForAI/aya-101

Thank you

May 23 '24 20:05 user-0a

I would also like to request support for Command-R and Command-R+; they are currently the best open source models.

May 30 '24 05:05 imnoahcook

Yes please. Command-R+ support is needed!

Jun 04 '24 08:06 aikitoria

Jun 06 '24 13:06 here4dadata

Hi all, we've started investigating and implementing the Cohere models. Delivery is planned for version 0.13.

Aug 21 '24 05:08 syuoni

Nice!

Aug 22 '24 14:08 aikitoria

@syuoni Please also support these! https://huggingface.co/CohereForAI/c4ai-command-r-08-2024 https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024

Aug 30 '24 15:08 aikitoria

@syuoni Please also support these! https://huggingface.co/CohereForAI/c4ai-command-r-08-2024 https://huggingface.co/CohereForAI/c4ai-command-r-plus-08-2024

These models share the same architecture, CohereForCausalLM, so it's very likely they will be supported automatically once Command-R is ready.
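
For anyone who wants to check this locally, one option is to compare the `architectures` field in each model's Hugging Face `config.json`. The snippet below is only a minimal sketch and not part of TensorRT-LLM: the helper names `load_architectures` and `same_architecture` are made up for illustration, and the JSON strings are hand-written stand-ins for the relevant part of each `config.json`, not the real files.

```python
import json


def load_architectures(config_text):
    """Return the set of architecture names declared in a config.json string."""
    config = json.loads(config_text)
    return set(config.get("architectures", []))


def same_architecture(config_a, config_b):
    """True if both configs declare the same, non-empty set of architectures."""
    archs_a = load_architectures(config_a)
    archs_b = load_architectures(config_b)
    return bool(archs_a) and archs_a == archs_b


# Hand-written stand-ins (assumptions); real config.json files have many more keys.
COMMAND_R_V01 = '{"architectures": ["CohereForCausalLM"]}'
COMMAND_R_08_2024 = '{"architectures": ["CohereForCausalLM"]}'
LLAMA_LIKE = '{"architectures": ["LlamaForCausalLM"]}'

print(same_architecture(COMMAND_R_V01, COMMAND_R_08_2024))
print(same_architecture(COMMAND_R_V01, LLAMA_LIKE))
```

In practice you would read the text from each downloaded checkpoint's `config.json` instead of the inline strings. A matching architecture name suggests, but does not guarantee, that the same conversion path applies.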

Aug 31 '24 04:08 syuoni

Thank you!

Sep 04 '24 21:09 user-0a

Does anyone know how similar the Cohere family of models is to the other OSS models supported in TensorRT-LLM?

Sep 30 '24 21:09 salaki

@syuoni Is the feature delayed? If I wanted to write a converter from the Hugging Face model definition to a TensorRT-LLM checkpoint myself, which document should I check?

Sep 30 '24 21:09 salaki

It doesn't look like it is out: none of the recent commits mention it, but version 0.13 was just released...

Sep 30 '24 21:09 aikitoria

Hi all,

Yes, Cohere support is postponed; it's not available in 0.13. Cohere has some structures that differ considerably from LLaMA, e.g., qk_layernorm, so aligning the accuracy took some extra time.
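
For readers unfamiliar with the term, QK layer normalization means normalizing the query and key vectors of each attention head before the attention scores are computed. The pure-Python sketch below only illustrates that idea; it is not the TensorRT-LLM or Hugging Face implementation. The function names, the `eps` value, and the choice of a scale-only layer norm (no bias) applied over the head dimension are assumptions made for illustration.

```python
import math


def layer_norm(vec, weight, eps=1e-5):
    """Normalize one vector to zero mean and unit variance, then scale by weight."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * inv_std * w for x, w in zip(vec, weight)]


def qk_layernorm(q_heads, k_heads, q_weight, k_weight):
    """Apply layer norm to every per-head query and key vector.

    q_heads / k_heads: list of heads, each a list of head_dim floats.
    q_weight / k_weight: one scale vector per head (an assumption of this sketch).
    """
    q_out = [layer_norm(q, w) for q, w in zip(q_heads, q_weight)]
    k_out = [layer_norm(k, w) for k, w in zip(k_heads, k_weight)]
    return q_out, k_out


def attention_score(q, k):
    """Scaled dot product between one query vector and one key vector."""
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


# Two heads with head_dim = 4; the second head has 100x larger magnitudes.
q = [[1.0, 2.0, 3.0, 4.0], [100.0, 200.0, 300.0, 400.0]]
k = [[4.0, 3.0, 2.0, 1.0], [400.0, 300.0, 200.0, 100.0]]
ones = [[1.0] * 4, [1.0] * 4]

raw_scores = [attention_score(qh, kh) for qh, kh in zip(q, k)]
q_n, k_n = qk_layernorm(q, k, ones, ones)
norm_scores = [attention_score(qh, kh) for qh, kh in zip(q_n, k_n)]
print(raw_scores, norm_scores)
```

Without normalization the two heads produce scores that differ by a factor of 10,000; after normalization they are nearly identical. A block like this changes the attention scores, so an implementation that handles it differently from the reference model would be expected to produce different outputs.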

The MR is ready and under review in our internal repo. I think it will be released soon. Thanks!

Oct 01 '24 06:10 syuoni

Hi all,

The Command-R and Aya models are now supported on the main branch. See: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/commandr

I'm closing this issue. Thanks!

Oct 17 '24 10:10 syuoni
