GaLore
Why not reproject the internal Adam states during update_proj_gap?
Hi, great project. After reading the paper and the implementation, I am wondering whether you considered reprojecting the Adam internal states (exp_avg, exp_avg_sq) from the previous subspace into the new subspace whenever the projection is refreshed.
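To make the question concrete, here is a minimal NumPy sketch of what I mean. It is not taken from the GaLore code: it assumes a left projection with orthonormal columns, so the low-rank gradient is `P.T @ G`, and the helper names `orthonormal_basis` and `reproject_adam_states` are hypothetical. The first moment is linear in the gradient, so it can be mapped with `P_new.T @ P_old`. The second moment is an elementwise square and has no exact linear map; the squared-weights rule below is only a heuristic that ignores cross terms between components.

```python
import numpy as np


def orthonormal_basis(rng, m, r):
    """Return an m x r matrix with orthonormal columns (stand-in for a projector)."""
    q, _ = np.linalg.qr(rng.standard_normal((m, r)))
    return q


def reproject_adam_states(exp_avg, exp_avg_sq, p_old, p_new):
    """Map r x n Adam states from the old subspace to the new one.

    Assumption: states were accumulated on p_old.T @ grad (left projection).
    """
    # r x r change of basis: lift with p_old, project down with p_new.
    transfer = p_new.T @ p_old
    # First moment is linear in the gradient, so this is the projected value.
    new_exp_avg = transfer @ exp_avg
    # Heuristic only: treats old components as uncorrelated, so the second
    # moment mixes with squared weights. This keeps the result non-negative.
    new_exp_avg_sq = (transfer ** 2) @ exp_avg_sq
    return new_exp_avg, new_exp_avg_sq


rng = np.random.default_rng(0)
m, n, r = 8, 6, 3
p_old = orthonormal_basis(rng, m, r)
p_new = orthonormal_basis(rng, m, r)
exp_avg = rng.standard_normal((r, n))
exp_avg_sq = rng.random((r, n))

new_m, new_v = reproject_adam_states(exp_avg, exp_avg_sq, p_old, p_new)
```

If the subspace does not change (`p_new` equal to `p_old`), the transfer matrix is the identity and both states are returned unchanged, which seems like the minimum sanity requirement for any such scheme.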