
[Feature][Config] Add QwQ model configs

Open wizeng23 opened this issue 9 months ago • 6 comments

Feature request

QwQ is a 32B-parameter reasoning model developed by Alibaba's Qwen team. It would be good to add training/evaluation/inference configs for it.

Motivation / references

https://huggingface.co/Qwen/QwQ-32B-Preview

Your contribution

If somebody can volunteer to start this work, I can answer questions and help with testing.

wizeng23 avatar Feb 07 '25 09:02 wizeng23

Hi, I would like to work on this issue. I understand that I need to add new config files for QwQ under /configs/recipes. I have gone through the existing config files for the DeepSeek distill models and the GPT-2 model. Their evaluation configs share some common attributes like model, generation, and tasks, but I also noticed some additional attributes: e.g., the GPT-2 eval config has "polling_interval" and "output_dir", which were not present in the others. To understand this better, I am going through the doc on evaluation configs. Is there any other documentation I should go through to better understand the config structure? Thanks

HelixY2J avatar Feb 26 '25 18:02 HelixY2J

Thanks for your interest @HelixY2J ! The GPT2 config actually uses a different config class, AsyncEvaluationConfig, which contains some parameters that the other eval configs don't. Its documentation is currently wrong, which I will fix. I'd recommend looking at the eval configs for the Llama models (e.g., configs/recipes/llama3_1/evaluation/8b_eval.yaml) as a good reference point. As for documentation pages, you can take a look at https://oumi.ai/docs/en/latest/user_guides/evaluate/evaluate.html. There are many evaluation docs in the sidebar. A useful page for understanding the config structure is https://oumi.ai/docs/en/latest/user_guides/evaluate/evaluation_config.html.
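
To make the structure concrete, here's a rough sketch of what a QwQ eval config modeled on the Llama one could look like. Treat it as illustrative only: the task name, context length, and other values are assumptions to verify against the model card, not a finished config.

```yaml
# Illustrative sketch of a QwQ eval config, modeled on
# configs/recipes/llama3_1/evaluation/8b_eval.yaml.
# All values below are assumptions for illustration, not the final config.

model:
  model_name: "Qwen/QwQ-32B-Preview"  # HF model ID from the issue description
  model_max_length: 32768             # assumed context length; check the model card
  torch_dtype_str: "bfloat16"
  shard_for_eval: True                # a 32B model likely needs sharding across GPUs
  trust_remote_code: True

generation:
  batch_size: 4                       # tune for available GPU memory

tasks:
  # LM Harness benchmark tasks; the task name here is just an example
  - evaluation_backend: lm_harness
    task_name: mmlu_college_computer_science
```

Once a config like this exists, `oumi evaluate -c <path-to-config>` should run it.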

wizeng23 avatar Feb 26 '25 21:02 wizeng23

Aha, noted. I will look into the Llama model configs as well as the docs you shared. Thanks a lot!

HelixY2J avatar Feb 26 '25 23:02 HelixY2J

Hey @HelixY2J, QwQ was just released a few hours ago, and since it performs very well as a reasoning model, I'm adding the configs ASAP to support training. My apologies for commandeering this bug! If you're still willing to help with this task, let me know, and I can point you to the remaining work.

wizeng23 avatar Mar 05 '25 22:03 wizeng23

Hey there, sorry for the delay on my end; it took me more time than expected. And yes, I would like to work on any remaining tasks. Could you point me to the areas that still need work?

HelixY2J avatar Mar 06 '25 08:03 HelixY2J

It would be great if you could try running training or evaluation to confirm that they work, and to find good hyperparameters for training. Reproducing their reported benchmark results via evaluation would also be interesting.
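
For the hyperparameter side, the knobs I'd expect you to sweep live in the training section of the config. Below is a rough sketch of the shape; the dataset and every number are illustrative placeholders to tune, not recommendations.

```yaml
# Rough sketch of the training knobs worth sweeping for QwQ; all values are
# illustrative starting points, not validated hyperparameters.

model:
  model_name: "Qwen/QwQ-32B"          # post-release HF checkpoint
  torch_dtype_str: "bfloat16"

data:
  train:
    datasets:
      - dataset_name: "yahma/alpaca-cleaned"  # placeholder SFT dataset

training:
  trainer_type: "TRL_SFT"
  learning_rate: 2.0e-5               # sweep, e.g., 1e-5 to 5e-5
  warmup_ratio: 0.03
  num_train_epochs: 1
  per_device_train_batch_size: 1      # 32B params; keep small, accumulate instead
  gradient_accumulation_steps: 8
  enable_gradient_checkpointing: True # trades compute for memory on a model this size
  output_dir: "output/qwq_32b_sft"
```

Running `oumi train -c <config>` and then `oumi evaluate -c <config>` should quickly show whether a given setting is sane on your hardware.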

wizeng23 avatar Mar 06 '25 22:03 wizeng23