Plans for support of Qwen models?

Open shamuiscoding opened this issue 7 months ago • 2 comments

Are there plans to support Qwen models in MaxText?

Besides Gemma, DeepSeek, Mistral, and Llama are already supported, but Qwen seems to be missing.

shamuiscoding May 12 '25 10:05

Hi, yes, we are looking into this. Are there particular Qwen variants you would want to see in MaxText? Any context or motivation would be helpful.

shralex May 12 '25 20:05

Awesome :) Qwen3 4B and 8B would be my primary interest. I'm planning to use them as a backbone for a speech model, training on TPUs provided by the TRC. Bigger models would also be interesting, but they would be too hefty for me to fine-tune.

shamuiscoding May 13 '25 10:05

Yes, it is quite strange that Qwen is not included in the release, as it is a fairly popular and widely cited model family. I would particularly like to see the latest Qwen3 and Qwen3 MoE.

0x7o Jun 11 '25 14:06