
💡 [REQUEST] - How can I resume yesterday's training instead of starting over?

Open sunjunlishi opened this issue 10 months ago • 3 comments

Start Date

No response

Implementation PR

No response

Reference Issues

No response

Summary

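Yesterday I trained up to checkpoint-1200, where the loss was 0.9. Today I want to continue from that point. Is this possible? Otherwise the loss has to come down again from its starting value of 2.7, which takes a long time.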

Basic Example

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_args.model_name_or_path,
    config=config,
    cache_dir=training_args.cache_dir,
    device_map=device_map,
    trust_remote_code=True,
    quantization_config=GPTQConfig(
        bits=4, disable_exllama=True
    )
    if training_args.use_lora and lora_args.q_lora
    else None,
)

Drawbacks

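Everyone's resources are limited, so we can only train at night. If yesterday's training cannot be resumed and we have to start again from scratch, it is very time-consuming.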

Unresolved questions

No response

sunjunlishi, Apr 12 '24 03:04

--resume_from_checkpoint /path/to/your/checkpoint
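For anyone who would rather not type the path by hand each evening, here is a minimal sketch of picking the newest checkpoint automatically. It assumes checkpoints are saved as `checkpoint-<step>` folders inside the training output directory, which matches the `checkpoint-1200` name mentioned in the issue. `find_last_checkpoint` is a hypothetical helper written for this illustration and is not part of `finetune.py`. Note also that in Hugging Face Transformers the `resume_from_checkpoint` training argument is meant to be consumed by the training script, so the flag only takes effect if the script passes it on to `trainer.train(resume_from_checkpoint=...)`. Whether `finetune.py` already does this is not confirmed in this thread.

```python
import os
import re
from typing import Optional

# Matches folder names such as "checkpoint-1200" and captures the step number.
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the path of the checkpoint folder with the highest step, or None.

    Hypothetical helper: assumes the "checkpoint-<step>" naming convention.
    """
    if not os.path.isdir(output_dir):
        return None

    best_step = -1
    best_path = None
    for name in os.listdir(output_dir):
        match = _CHECKPOINT_RE.match(name)
        path = os.path.join(output_dir, name)
        # Skip anything that is not a directory named checkpoint-<number>.
        if match is None or not os.path.isdir(path):
            continue
        step = int(match.group(1))
        if step > best_step:
            best_step = step
            best_path = path
    return best_path


# Sketch of how the result could be used inside a training script
# (assumes `trainer` and `training_args` already exist there):
#
#     last = find_last_checkpoint(training_args.output_dir)
#     trainer.train(resume_from_checkpoint=last)  # None starts from scratch
```

Comparing step numbers as integers matters here: a plain string sort would rank `checkpoint-900` above `checkpoint-1200`.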

tristanwqy, Apr 13 '24 22:04

Thank you very much.

sunjunlishi, Apr 19 '24 10:04

Hi, could you tell me how to set up the fine-tuning environment? When I run finetune.py, DeepSpeed does not work properly. If it's convenient, could you reply? Thanks a lot!

Qinger27, Apr 28 '24 02:04