InternLM-XComposer
[Implementation details] Layer-wise learning rate decay
In the paper, you mention that learning rate decay is important. How can I set layer-wise decay in the code? Do you plan to release the learning rate decay code?
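For reference, here is a minimal sketch of what I mean by layer-wise learning rate decay. It is not the authors' implementation. The parameter naming scheme (`embed...`, `layers.<i>....`), the helper names `layer_id_for` and `build_param_groups`, and the decay formula `base_lr * decay ** (num_layers + 1 - layer_id)` are all assumptions based on how this technique is commonly set up.

```python
import re
from collections import defaultdict


def layer_id_for(name, num_layers):
    """Map a parameter name to a depth index (hypothetical naming scheme)."""
    match = re.search(r"layers\.(\d+)\.", name)
    if match:
        return int(match.group(1)) + 1  # transformer blocks: 1..num_layers
    if name.startswith("embed"):
        return 0  # embeddings sit at the bottom
    return num_layers + 1  # head and everything else sit at the top


def build_param_groups(named_params, base_lr, num_layers, decay):
    """Group parameters by depth; layers further from the top get a smaller lr."""
    groups = defaultdict(list)
    for name, param in named_params:
        groups[layer_id_for(name, num_layers)].append(param)
    # The top group keeps base_lr; each step down multiplies the lr by `decay`.
    return [
        {"params": params, "lr": base_lr * decay ** (num_layers + 1 - layer_id)}
        for layer_id, params in sorted(groups.items())
    ]
```

The returned list has the per-parameter-group shape that PyTorch optimizers accept, so something like `torch.optim.AdamW(build_param_groups(model.named_parameters(), 1e-4, 24, 0.9))` should work if the model's parameter names follow the assumed scheme. Is this close to what the paper does?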