Haicheng Wu
The one in the mainloop is redundant, but it does not hurt anything. It has to be `One` in the mainloop.
@alihassanijr , could you please take a look? CC += @richardmcai
@thakkarV @ANIKET-SHIVAM
@jackkosaian
cc += @alexsamardzic @IwakuraRein
@LucasWilkinson , we upstreamed our change to the groupwise scaling kernels. There are some conflicts in this PR that need to be resolved. Our change is mainly: ``` Extend groupwise scaling...
@itramble , could you please review first?
Sorry, it seems GTC pulled off the old ones.