
How to pretrain from scratch the Qwen 3 4B or 7B dense model (with my own data)?

Open tjoymeed opened this issue 6 months ago • 1 comments

Hi team,

Thanks for your excellent work!

How can I pretrain the Qwen 3 4B or 7B dense model from scratch (with my own data)?

Architecture-wise, they are the same as the Qwen 2.5 7B model, so can I reuse the Qwen 2.5 7B model's pretraining recipe?

Am I right?

Or do you have a pretraining recipe for the Qwen 3 4B or 7B base model directly?

Thanks a lot!

tjoymeed avatar Jun 05 '25 21:06 tjoymeed

Hi @tjoymeed, we do have a predefined Qwen3 4B model recipe here: https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/llm/recipes/qwen3_4b.py

To use your own data, you need to replace the data component of the recipe (e.g., recipe.data = function_to_construct_new_data_module). See the Qwen3 documentation here: https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/qwen3.html
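
The data swap described above can be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: it assumes NeMo 2.0's `llm.PreTrainingDataModule` and `nemo_run.Config`, and that your corpus is already preprocessed into Megatron-style indexed dataset prefixes. The helper names (`blend_paths`, `data_module_kwargs`, `apply_custom_data`), the dataset paths, and the batch sizes are all hypothetical placeholders.

```python
from typing import Any, Dict, List, Sequence, Tuple


def blend_paths(datasets: Sequence[Tuple[float, str]]) -> List[str]:
    """Flatten (weight, prefix) pairs into a zipped weight/path list,
    e.g. ["30", "/data/a_text_document", "70", "/data/b_text_document"]."""
    flat: List[str] = []
    for weight, prefix in datasets:
        if weight <= 0:
            raise ValueError(f"weight for {prefix!r} must be positive")
        flat += [str(weight), prefix]
    return flat


def data_module_kwargs(
    datasets: Sequence[Tuple[float, str]],
    seq_length: int = 4096,
    global_batch_size: int = 512,
    micro_batch_size: int = 1,
) -> Dict[str, Any]:
    """Collect keyword arguments for the data module (placeholder defaults)."""
    if global_batch_size % micro_batch_size != 0:
        raise ValueError("global_batch_size must be a multiple of micro_batch_size")
    return {
        "paths": blend_paths(datasets),
        "seq_length": seq_length,
        "global_batch_size": global_batch_size,
        "micro_batch_size": micro_batch_size,
    }


def apply_custom_data(recipe: Any, kwargs: Dict[str, Any]) -> Any:
    """Replace recipe.data with a data module built from kwargs.

    NeMo is imported lazily so the helpers above run without it installed.
    Assumption: llm.PreTrainingDataModule accepts these keyword arguments.
    """
    import nemo_run as run
    from nemo.collections import llm

    recipe.data = run.Config(llm.PreTrainingDataModule, **kwargs)
    return recipe


if __name__ == "__main__":
    # Hypothetical dataset prefixes with blend weights 30 and 70.
    kwargs = data_module_kwargs(
        [(30, "/data/corpus_a_text_document"), (70, "/data/corpus_b_text_document")]
    )
    print(kwargs["paths"])
```

With NeMo installed, the intended usage would be to build the recipe from the linked `qwen3_4b.py` module and then call `apply_custom_data(recipe, kwargs)`. Check the recipe file and the Qwen3 documentation for the exact recipe function name and arguments, and note that a Qwen3 tokenizer will likely also need to be passed to the data module.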

suiyoubi avatar Jun 13 '25 17:06 suiyoubi

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jul 14 '25 02:07 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Jul 21 '25 02:07 github-actions[bot]