Hanze Dong
BTW, can you try ZeRO-2 rather than ZeRO-3? ZeRO-3 often causes problems on the DeepSpeed side.
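For reference, switching stages is usually just a change in the DeepSpeed config. A minimal sketch of a ZeRO-2 setup written as a Python dict (the surrounding values are illustrative assumptions, not the project's actual settings):

```python
import json

# Minimal DeepSpeed config sketch using ZeRO stage 2 instead of stage 3.
# The batch-size placeholders and flags below are assumptions for illustration.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,                    # ZeRO-2: shard optimizer states and gradients only
        "overlap_comm": True,          # overlap communication with the backward pass
        "contiguous_gradients": True,  # reduce memory fragmentation
    },
}

# Write it out so the launcher can point at it, e.g. --deepspeed ds_config_zero2.json
with open("ds_config_zero2.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```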
Hi, I think there should be a config.json once training is finished. Can you double-check the script content, or whether the run actually completed successfully?
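A quick way to verify is to list the output directory and check for the expected files (the directory name below is a placeholder):

```python
import os

# Hypothetical output directory; replace with the output_dir used in your script.
output_dir = "output_models/finetune"

# config.json (and, for LoRA runs, adapter_config.json / adapter_model.bin)
# should appear only after training finishes and the final model is saved.
print(sorted(os.listdir(output_dir)))
print("config.json present:", os.path.exists(os.path.join(output_dir, "config.json")))
```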
Hi, we do support Chinese, but since LLaMA's pre-training corpus contains relatively little Chinese, fine-tuning may require more high-quality data to achieve good results. If you have good data, feel free to try it; we are also running some experiments and iterations ourselves and will share the results with everyone.
The checkpoints are intermediate full models. You can use adapter_model.bin once training is completed.
> > The checkpoints are intermediate full models. You can use adapter_model.bin once training is completed.
>
> How to extract the adapter_model.bin file from the checkpoint folder? Due...
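For context, loading a finished LoRA adapter typically looks like the sketch below. This assumes the adapter was saved with PEFT and that the output directory contains adapter_config.json and adapter_model.bin; the base model name and paths are placeholders, not the project's settings:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder names; substitute your own base model and finished output directory.
base_model_name = "decapoda-research/llama-7b-hf"
adapter_dir = "output_models/finetune"

base_model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.float16)
# PeftModel reads adapter_config.json / adapter_model.bin from adapter_dir
model = PeftModel.from_pretrained(base_model, adapter_dir)
model.eval()
```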
Given your model size, I think precision may play a role. Are you training in FP32 or FP16/BF16?
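As a rough illustration (argument values below are assumptions): 16-bit formats roughly halve parameter and activation memory relative to FP32, and in transformers they are usually enabled through the training arguments:

```python
from transformers import TrainingArguments

# Sketch only: enable mixed precision so weights/activations use 16-bit formats.
# Use bf16=True on Ampere or newer GPUs, fp16=True otherwise; do not set both.
training_args = TrainingArguments(
    output_dir="output_models/finetune",   # placeholder path
    per_device_train_batch_size=1,
    bf16=True,   # or fp16=True; plain FP32 (the default) uses roughly twice the memory
)
```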
Hi, it looks like the "ZeRO-Offload" configuration is not correct; you may want to double-check the YAML file. BTW, this might be more related to the DeepSpeed configuration.
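For comparison, the ZeRO-Offload portion of a DeepSpeed config usually takes the shape below (sketched here as a Python dict with assumed values; the exact layout of your YAML depends on how the launcher embeds the DeepSpeed settings):

```python
# Sketch of the ZeRO-Offload part of a DeepSpeed config; values are assumptions.
ds_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {   # ZeRO-Offload: move optimizer states to CPU memory
            "device": "cpu",
            "pin_memory": True,
        },
    },
    "bf16": {"enabled": True},   # illustrative; match this to your training precision
}
```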