Ashwath Aithal

Results 10 comments of Ashwath Aithal

@meatybobby what is the status? Can we cross-post the Automodel PR?

@akoumpa is there a reason you don't want to use Muon from https://github.com/NVIDIA-NeMo/Emerging-Optimizers?

@sanjana-inflection can you please respond to the request?

Updating the status here from @yfw: This seems like a large model, so we will most likely need to use the mcore path for this. We recently merged this...

@ZhiyuLi-Nvidia is this something you can review?

@sharathts can you please take a look and opine?

@yaoyu-33 @yfw can we get a review for this?

@guyueh1 can we also add a large model like 70B? @joyang-nv we also need FP8 policy in the DTensor path. We should enable this after we move to Automodel...