binmakeswell
Hi @gouchangjiang, did you solve it? This issue was closed due to inactivity. Thanks.
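User feedback: ColossalAI does not yet have a complete example showing how to save (and load) the optimizer and LR scheduler state under hybrid parallelism (ZeRO-3 + tensor + pipeline). This feature is important for restarting training from the last checkpoint after a crash. Could the maintainers please add a demo?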
We have completed most of the related Checkpoint development and are doing the final polishing and refinement. Thanks.
Thanks for the awesome work!
This issue was closed because it results from incorrect usage. Thanks.
We have updated a lot. This issue was closed due to inactivity. Thanks.
We have updated a lot. This issue was closed due to inactivity. Thanks.
We have updated a lot. This issue was closed due to inactivity. Thanks.
We have updated a lot. This issue was closed due to inactivity. Thanks.
Hi @Tron1994, we have updated a lot. You can check our new example: https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/gpt This issue was closed due to inactivity. Thanks.