Ashwath Aithal comments

Results: 10 comments of Ashwath Aithal

Add support for Embedding Model finetuning

@meatybobby what is the status? Can we cross-post the Automodel PR?

Support Muon + Dion optimizers

@akoumpa is there a reason you don't want to use Muon from: https://github.com/NVIDIA-NeMo/Emerging-Optimizers

fix: add tooltip, fix get wrong percentage, change cursor when not allowed

@sanjana-inflection can you please respond to the request?

[Feature Request] Qwen3VL GRPO, SFT training

@yfw please opine

[Feature Request] Qwen3VL GRPO, SFT training

Updating the status here from @yfw: This seems like a large model, so we will most likely need to use the mcore path for this. We recently just merged this...

system oom with qwen 235b

@ZhiyuLi-Nvidia is this something you can review?

HF Conversion for On-policy Distillation Trained Models

@sharathts can you please take a look and opine?

feat: Support mcore virtual pipeline parallel

@yaoyu-33 @yfw can we get a review for this?

Hopper FP8 GRPO Recipe Productization

@guyueh1 can we also add a large model like 70B? @joyang-nv we also need FP8 policy in the Dtensor path. We should enable this after we move to Automodel...

policy.train slow at >=32 nodes b/c workers start at different time

@katec846 please update the latest status