
modify clip_grad with no to_global

Open hanwen-sun opened this issue 11 months ago • 23 comments

Remove the first to_global in the clip_grad norm computation, to reduce unnecessary all gather operations under tensor parallelism.
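Why the gather can be skipped: an L2 norm over a sharded tensor can be rebuilt from one partial sum of squares per rank, so only a scalar has to cross ranks. The sketch below is plain Python, not OneFlow code. The shards are simulated as lists, and the helper names (`local_sq_sum`, `l2_norm_from_shards`, `l2_norm_after_gather`) are hypothetical, used only for illustration.

```python
import math

def local_sq_sum(shard):
    """Sum of squares over the elements one rank holds locally."""
    return sum(x * x for x in shard)

def l2_norm_from_shards(shards):
    # Stand-in for an all-reduce(sum) over one scalar per rank:
    # each rank contributes only its partial sum of squares.
    total = sum(local_sq_sum(s) for s in shards)
    return math.sqrt(total)

def l2_norm_after_gather(shards):
    # Stand-in for an all-gather: rebuild the full tensor first,
    # then take the norm. Same result, but every element is moved.
    full = [x for s in shards for x in s]
    return math.sqrt(sum(x * x for x in full))

# Two simulated ranks, each holding half of a 4-element gradient.
shards = [[3.0, 0.0], [0.0, 4.0]]
print(l2_norm_from_shards(shards))   # 5.0
print(l2_norm_after_gather(shards))  # 5.0
```

Both paths give the same norm; the first exchanges one number per rank, while the second moves the whole tensor. Whether the actual change in clip_grad follows exactly this pattern is an assumption here.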

hanwen-sun · Mar 11 '24 03:03