Xiangning Chen

21 comments by Xiangning Chen

Hi, what is the model size in your setting? When the model is small, I think the main memory overhead comes from the activations, so the saved second moment may...
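
To make the point about model size concrete, here is a rough back-of-the-envelope sketch. It is not taken from the comment: the memory model (weights + gradients + two per-parameter moment buffers + activations, all in fp32) and the function names `optimizer_state_bytes` and `second_moment_savings_fraction` are assumptions made for illustration, and the activation size is a number you would have to measure in your own setup.

```python
def optimizer_state_bytes(num_params: int, num_moments: int,
                          bytes_per_value: int = 4) -> int:
    """Bytes needed to hold `num_moments` per-parameter buffers."""
    return num_params * num_moments * bytes_per_value


def second_moment_savings_fraction(num_params: int, activation_bytes: int,
                                   bytes_per_value: int = 4) -> float:
    """Fraction of training memory saved by dropping one moment buffer.

    Assumed (hypothetical) baseline: weights + gradients + two moment
    buffers + activations. Real frameworks add other overheads.
    """
    weights = num_params * bytes_per_value
    grads = num_params * bytes_per_value
    two_moments = optimizer_state_bytes(num_params, 2, bytes_per_value)
    total = weights + grads + two_moments + activation_bytes
    # Dropping the second moment frees exactly one per-parameter buffer.
    saved = optimizer_state_bytes(num_params, 1, bytes_per_value)
    return saved / total


# Hypothetical settings: 8 GiB of activations in both cases.
activation_bytes = 8 * 1024**3
small = second_moment_savings_fraction(10_000_000, activation_bytes)
large = second_moment_savings_fraction(1_000_000_000, activation_bytes)
print(f"small model: {small:.2%} saved, large model: {large:.2%} saved")
```

Under these assumed numbers the saving is a small fraction of the total for the small model and a much larger one for the large model, which is the effect the comment describes.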