Support Qwen3

Open pocca2048 opened this issue 8 months ago • 3 comments

Compared with Qwen2, Qwen3 only drops the bias on the attention projections and adds qk_norm (like in Gemma 3), so it should be straightforward to support.
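To make the difference concrete, here is a minimal pure-Python sketch of the query/key path, assuming that qk_norm means an RMSNorm over each head's `head_dim` slice, with one norm weight shared across heads and applied before rotary embeddings (my understanding of Qwen3, not confirmed in this thread). The names `rms_norm`, `linear`, `split_heads`, and `project_qk` are hypothetical helpers for illustration only, not torchtune APIs.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm of a single vector: x / sqrt(mean(x^2) + eps) * weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def linear(x, W, bias=None):
    """y = W @ x (+ bias), with W stored as a list of rows [out][in]."""
    y = [sum(w * v for w, v in zip(row, x)) for row in W]
    if bias is not None:
        y = [a + b for a, b in zip(y, bias)]
    return y

def split_heads(v, num_heads):
    """Split a flat projection into num_heads slices of head_dim each."""
    head_dim = len(v) // num_heads
    return [v[i * head_dim:(i + 1) * head_dim] for i in range(num_heads)]

def project_qk(x, Wq, Wk, num_heads,
               bq=None, bk=None, q_norm_w=None, k_norm_w=None):
    """Project one token to per-head queries and keys.

    Qwen2-style (assumed): pass bq/bk, leave the norm weights as None.
    Qwen3-style (assumed): leave bq/bk as None, pass q_norm_w/k_norm_w.
    """
    q_heads = split_heads(linear(x, Wq, bq), num_heads)
    k_heads = split_heads(linear(x, Wk, bk), num_heads)
    if q_norm_w is not None:
        # Same head_dim-sized weight reused for every head (assumption).
        q_heads = [rms_norm(h, q_norm_w) for h in q_heads]
    if k_norm_w is not None:
        k_heads = [rms_norm(h, k_norm_w) for h in k_heads]
    # Rotary embeddings and the attention scores would follow from here.
    return q_heads, k_heads
```

Calling `project_qk` with `bq`/`bk` set and no norm weights gives the older behaviour, and calling it with norm weights and no biases gives the new one, so the change is confined to the Q/K projection step.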

pocca2048, Apr 29 '25 04:04

Hi @pocca2048, thanks for creating the issue. It would be great to support this in torchtune. Do you have any interest in opening a PR? We would be happy to provide guidance on the implementation; you can take a look at the Qwen 2.5 PR as a reference: #1863

ebsmothers, Apr 30 '25 01:04

Unfortunately, I don’t have the time to work on this at the moment... 😢 If it’s still unclaimed when I have some availability, I’d be happy to give it a try.

pocca2048, Apr 30 '25 07:04

I've created a draft PR with only the model builders (no recipes added yet): https://github.com/pytorch/torchtune/pull/2669

If someone can help review this, I can work on adding the recipes and verifying them. I just need someone to check that I'm on the right track.

prvnsmpth, May 03 '25 10:05