InternLM-XComposer

layer-wise learning rate

Open • liuheng0111 opened this issue 3 months ago • 1 comment

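I have a question about the layer-wise learning rate. A single layer contains several groups of parameters, such as Q, K, V, and the MLP. Do all parameters in the same layer share one learning rate, or does the learning rate keep decaying in top-to-bottom order inside a layer as well, so that parameters within the same layer get different learning rates?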
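To make the two options concrete, here is a minimal sketch of the first one, where every parameter in a layer shares a single rate and only the layer depth changes it. This is the scheme commonly used for layer-wise learning rate decay in general; whether InternLM-XComposer does exactly this is not confirmed anywhere in this thread. The parameter names (`embed`, `layers.N.…`, `head`), the helper functions, and the values of `base_lr` and `decay` are all hypothetical, chosen only for illustration.

```python
import re

def layer_id_for(name: str, num_layers: int) -> int:
    """Map a parameter name to a depth index (hypothetical naming scheme)."""
    if name.startswith("embed"):
        return 0                        # embeddings sit below the first block
    match = re.match(r"layers\.(\d+)\.", name)
    if match:
        return int(match.group(1)) + 1  # block i -> depth i + 1
    return num_layers + 1               # head / final norm: top of the stack

def layerwise_lrs(names, num_layers, base_lr, decay):
    """Return {param_name: lr}; lr is multiplied by `decay` once per layer below the top."""
    top = num_layers + 1
    return {
        n: base_lr * decay ** (top - layer_id_for(n, num_layers))
        for n in names
    }

# Hypothetical parameter names for a 2-block model.
names = [
    "embed.weight",
    "layers.0.attn.q.weight", "layers.0.attn.k.weight",
    "layers.0.attn.v.weight", "layers.0.mlp.fc1.weight",
    "layers.1.attn.q.weight", "layers.1.mlp.fc1.weight",
    "head.weight",
]
lrs = layerwise_lrs(names, num_layers=2, base_lr=1e-3, decay=0.5)
for n in names:
    print(f"{n:28s} {lrs[n]:.6f}")
```

Under this sketch, Q, K, V, and the MLP weights of `layers.0` all receive the same rate, and the rate differs only between layers. With PyTorch, such a mapping would typically be turned into optimizer parameter groups, each carrying its own `lr`.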

liuheng0111 • Mar 26 '24 03:03

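I'm also trying to reproduce the XComposer training process. Anyone who is interested is welcome to get in touch and compare notes.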

luohao123 • Apr 09 '24 09:04