Xiangning Chen
Results: 21 comments by Xiangning Chen
Hi, what is the model size in your setting? When the model is small, I think the main memory overhead comes from the activations, so the saved second moment may...
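To make the point about model size concrete, here is a rough back-of-the-envelope sketch. It is not taken from the comment: it assumes an Adam-style optimizer that keeps one first-moment and one second-moment value per parameter, fp32 storage (4 bytes per value), and a fixed activation footprint that you would have to measure yourself. The function name `second_moment_share` and the 8 GiB activation figure are hypothetical, chosen only for illustration.

```python
def second_moment_share(num_params, activation_bytes, bytes_per_value=4):
    """Estimate the fraction of training memory held by the second-moment buffer.

    Assumptions (not from the original comment): weights, gradients, first
    moment, and second moment each take one value per parameter, all stored
    at `bytes_per_value` bytes; `activation_bytes` is supplied by the caller.
    """
    per_buffer = num_params * bytes_per_value
    # weights + gradients + first moment + second moment
    param_side = 4 * per_buffer
    total = param_side + activation_bytes
    return per_buffer / total


# Hypothetical activation footprint, held constant for both models.
ACTIVATION_BYTES = 8 * 1024**3  # 8 GiB, illustrative only

small_share = second_moment_share(10_000_000, ACTIVATION_BYTES)
large_share = second_moment_share(1_000_000_000, ACTIVATION_BYTES)

print(f"10M-parameter model: second moment is {small_share:.2%} of memory")
print(f"1B-parameter model:  second moment is {large_share:.2%} of memory")
```

Under these assumptions, dropping the second moment frees well under 1% of memory for the small model but a much larger share for the big one. In practice activation memory also grows with batch size, sequence length, and architecture, so treat the numbers as an order-of-magnitude guide only.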