How to pretrain the Qwen 3 4B or 7B dense model from scratch (with my own data)?
Hi team,
Thanks for your excellent work!
How can I pretrain the Qwen 3 4B or 7B dense model from scratch with my own data?
Architecture-wise, are they the same as the Qwen 2.5 7B model, so that I could reuse the Qwen 2.5 7B pretraining recipe?
Am I right?
Or do you have a pretraining recipe for the Qwen 3 4B or 7B base LLM directly?
Thanks a lot!
Hi @tjoymeed , we do have a predefined Qwen3 4B recipe here: https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/llm/recipes/qwen3_4b.py
To use your own data, replace the data component in the recipe (e.g. `recipe.data = function_to_construct_new_data_module`). See the Qwen3 documentation here: https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/qwen3.html
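A minimal sketch of what that override can look like, assuming NeMo 2.0's recipe API (`pretrain_recipe`, `nemo_run`, and `llm.PreTrainingDataModule`); the dataset path prefix, batch sizes, and run name below are hypothetical placeholders, not values from this thread:

```python
# Sketch (untested): point the predefined Qwen3 4B recipe at your own
# Megatron-preprocessed dataset instead of the default data module.
import nemo_run as run
from nemo.collections import llm
from nemo.collections.llm.recipes import qwen3_4b

# Predefined Qwen3 4B pretraining recipe from the NeMo repo.
recipe = qwen3_4b.pretrain_recipe(
    name="qwen3_4b_custom_pretrain",  # hypothetical run name
    num_nodes=1,
    num_gpus_per_node=8,
)

# Replace the data component. PreTrainingDataModule expects data already
# preprocessed into Megatron .bin/.idx format; `paths` takes the file
# prefix(es) without the extension.
recipe.data = run.Config(
    llm.PreTrainingDataModule,
    paths=["/data/my_corpus_text_document"],  # hypothetical path prefix
    seq_length=4096,
    global_batch_size=512,
    micro_batch_size=1,
)

# Launch locally; use a cluster executor (e.g. Slurm) for multi-node jobs.
run.run(recipe, executor=run.LocalExecutor())
```

This is a configuration sketch, not a definitive implementation; check the recipe file linked above for the exact `pretrain_recipe` signature in your NeMo version.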