ZhiyuLi-Nvidia

Results 4 issues of ZhiyuLi-Nvidia

> [!IMPORTANT]
> The `Update branch` button must only be pressed on very rare occasions.
> An outdated branch is never blocking the merge of a PR.
> Please reach...

> [!IMPORTANT]
> The `Update branch` button must only be pressed on very rare occasions.
> An outdated branch is never blocking the merge of a PR.
> Please reach...

TTS
NLP
Multi Modal

# What does this PR do ?

Successful run after the fix with tp2 sq enabled in qwen model: https://wandb.ai/nvidia/grpo-dev-zhiyul/runs/nyq6n98w/overview?nw=nwuserzhiyul

# Issues

List issues that this PR closes ([syntax](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)): #...

The [Muon optimizer](https://github.com/KellerJordan/Muon?tab=readme-ov-file) has attracted a lot of interest in the community and is currently WIP in mcore. Also, it has been reported that model performance is even better if the same...

algorithm
community-request