Juanxi Tian issues


Results: 4 issues of Juanxi Tian

Add the MTMD model on Alpha360

Add the MTMD model to the main branch

documentation

waiting for triage

Update & Supplement with new custom optimizer

Distributed improvement of Muon implementation

# What does this PR do? This PR reworks the Muon implementation with distributed training in mind. 1. Distributed Training Support: Added gradient synchronization via reduce_scatter_tensor and parameter updates via all_gather_into_tensor for proper...
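The preview above names two collectives, reduce_scatter_tensor and all_gather_into_tensor, which exist in torch.distributed. The PR's actual code is not shown here, so the following is only a minimal pure-Python sketch of what those two operations do in a sharded update: it simulates the ranks as list entries, and it substitutes a plain averaged-gradient SGD step for Muon's real update. The function names `reduce_scatter`, `all_gather`, and `sharded_step` are hypothetical and chosen for illustration.

```python
from typing import List


def reduce_scatter(per_rank_inputs: List[List[float]]) -> List[List[float]]:
    """Simulated reduce-scatter: rank r receives the element-wise sum of
    chunk r taken across every rank's input."""
    world = len(per_rank_inputs)
    n = len(per_rank_inputs[0])
    assert n % world == 0, "input length must divide evenly across ranks"
    chunk = n // world
    shards = []
    for r in range(world):
        lo, hi = r * chunk, (r + 1) * chunk
        shards.append([sum(inp[i] for inp in per_rank_inputs) for i in range(lo, hi)])
    return shards


def all_gather(per_rank_shards: List[List[float]]) -> List[List[float]]:
    """Simulated all-gather: every rank ends up with all shards concatenated
    in rank order."""
    full = [x for shard in per_rank_shards for x in shard]
    return [list(full) for _ in per_rank_shards]


def sharded_step(params: List[float],
                 per_rank_grads: List[List[float]],
                 lr: float) -> List[List[float]]:
    """One sharded update (assumption: plain SGD on the averaged gradient
    stands in for Muon's update, which is not reproduced here)."""
    world = len(per_rank_grads)
    chunk = len(params) // world
    # 1. Each rank gets the summed gradient for its own shard only.
    grad_shards = reduce_scatter(per_rank_grads)
    # 2. Each rank updates just the parameters it owns.
    new_shards = []
    for r, g in enumerate(grad_shards):
        owned = params[r * chunk:(r + 1) * chunk]
        new_shards.append([p - lr * gi / world for p, gi in zip(owned, g)])
    # 3. Updated shards are gathered so every rank holds the full parameters.
    return all_gather(new_shards)


if __name__ == "__main__":
    grads = [[1.0, 1.0, 1.0, 1.0], [3.0, 3.0, 3.0, 3.0]]  # two simulated ranks
    print(sharded_step([1.0, 2.0, 3.0, 4.0], grads, lr=0.5))
```

In this pattern each rank computes the update only for the shard it owns, so the per-rank update work shrinks with the number of ranks, at the cost of one extra collective to redistribute the results.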

ScalingOpt | Welcome to join the Optimization Community!

ScalingOpt is a platform focused on optimization for large-scale deep learning. It advocates "Optimization at Scale," meaning verifiable and scalable optimization algorithms. This community platform is...