InternLM-XComposer

What parameter do I need to add to resume training after it was interrupted?

Open yimuu opened this issue 1 year ago • 3 comments

Training stopped partway through, at a certain step. Can I resume training from that step?

yimuu avatar Aug 14 '24 01:08 yimuu

You may modify the code and set resume_from_checkpoint=True in the TrainingArguments class.
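A minimal sketch of one way to pick the checkpoint to resume from, assuming the fine-tuning script uses the Hugging Face Trainer and saves folders named checkpoint-<step> under the output directory. The helper name find_last_checkpoint is hypothetical and not part of this repository; the commented usage assumes the script already has trainer and training_args objects.

```python
import os
import re
from typing import Optional

# Hugging Face Trainer names its checkpoint folders "checkpoint-<global_step>".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the highest-numbered checkpoint-<step> folder, or None if there is none.

    Hypothetical helper for illustration only.
    """
    if not os.path.isdir(output_dir):
        return None
    best_step, best_path = -1, None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        if match and os.path.isdir(path):
            step = int(match.group(1))
            # Compare numerically so that checkpoint-100 beats checkpoint-20.
            if step > best_step:
                best_step, best_path = step, path
    return best_path


# Assumed usage inside a script that already builds `trainer`:
# last = find_last_checkpoint(training_args.output_dir)
# trainer.train(resume_from_checkpoint=last)  # None starts from scratch
```

Passing an explicit path makes it visible in the logs which checkpoint is being loaded, which may help when a resume does not behave as expected.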

yuhangzang avatar Aug 16 '24 10:08 yuhangzang

#423 I found that resume_from_checkpoint=True has its own issue with LoRA training: the loss restarts from the beginning. Not sure whether you have seen a similar issue. @yuhangzang
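One way to check what a checkpoint actually recorded before resuming, as a rough sketch: Hugging Face Trainer checkpoints normally include a trainer_state.json file with global_step and log_history entries. The function name summarize_trainer_state is hypothetical, and this only inspects the saved state; it does not diagnose why the loss restarts.

```python
import json
import os


def summarize_trainer_state(checkpoint_dir: str) -> dict:
    """Read trainer_state.json and report the saved step and last logged loss.

    Hypothetical helper for illustration only.
    """
    state_path = os.path.join(checkpoint_dir, "trainer_state.json")
    with open(state_path, "r", encoding="utf-8") as f:
        state = json.load(f)
    # log_history is a list of dicts; only training log entries carry "loss".
    losses = [entry["loss"] for entry in state.get("log_history", []) if "loss" in entry]
    return {
        "global_step": state.get("global_step"),
        "last_logged_loss": losses[-1] if losses else None,
    }
```

Comparing the reported last_logged_loss with the first loss printed after resuming gives a quick indication of whether the trained weights were restored.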

YerongLi avatar Aug 18 '24 13:08 YerongLi

I found there is a problem saving checkpoints with 2d5-7b, while internlm-xcomposer2-vl-7b saves checkpoints correctly in different settings. #423 #426

YerongLi avatar Aug 19 '24 13:08 YerongLi