keras-nlp

I want to help make Keras Hub compatible with the Qwen and InternLM models. Before that, I have some questions I'd like to ask.

Open pass-lin opened this issue 11 months ago • 1 comment

Qwen and InternLM are Chinese LLMs. Their architectures are basically the same as Llama 3, with just a few extra bias terms in the attention layer. If I want to support them in Keras, should I create a new class called QwenCausalModel, or simply add a few more configuration options to the existing Llama 3 implementation? Additionally, is it possible for me to skip providing a Kaggle link and instead convert the weights directly from HF?
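To make the "configuration option" route concrete, here is a minimal pure-Python sketch. It is not Keras Hub code: `AttentionConfig`, `QKVProjection`, and the `use_qkv_bias` flag are hypothetical names, and it assumes the extra biases sit on the query/key/value projections. It only shows that one flag could separate a Llama-style projection from a Qwen-style one.

```python
from dataclasses import dataclass
import random


@dataclass
class AttentionConfig:
    hidden_dim: int
    # Hypothetical flag: False mimics a bias-free (Llama-style) projection,
    # True adds a bias to the query/key/value projections (assumed Qwen-style).
    use_qkv_bias: bool = False


class QKVProjection:
    """Toy query/key/value projection over a single hidden vector."""

    NAMES = ("query", "key", "value")

    def __init__(self, config, seed=0):
        rng = random.Random(seed)
        d = config.hidden_dim
        self.dim = d
        # One square kernel per projection, filled with small random values.
        self.kernels = {
            name: [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]
            for name in self.NAMES
        }
        # Bias vectors exist only when the flag is set; they start at zero.
        self.biases = (
            {name: [0.0] * d for name in self.NAMES}
            if config.use_qkv_bias
            else {}
        )

    def param_count(self):
        kernel_params = len(self.kernels) * self.dim * self.dim
        bias_params = len(self.biases) * self.dim
        return kernel_params + bias_params

    def __call__(self, x):
        d = self.dim
        outputs = {}
        for name, kernel in self.kernels.items():
            # Plain matrix-vector product: y[j] = sum_i x[i] * kernel[i][j].
            y = [sum(x[i] * kernel[i][j] for i in range(d)) for j in range(d)]
            bias = self.biases.get(name)
            if bias is not None:
                y = [value + b for value, b in zip(y, bias)]
            outputs[name] = y
        return outputs


llama_style = QKVProjection(AttentionConfig(hidden_dim=4))
qwen_style = QKVProjection(AttentionConfig(hidden_dim=4, use_qkv_bias=True))
```

With the flag off, the layer has no bias parameters at all, so existing bias-free checkpoints would load unchanged. Whether a flag or a separate model class fits better is a design decision for the maintainers.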

pass-lin, Nov 11 '24 06:11