Hanze Dong

Results 7 comments of Hanze Dong

BTW, can you try zero2 rather than zero3? zero3 often causes problems that come from DeepSpeed itself.
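For anyone who wants to try this, the ZeRO stage is a single field in the DeepSpeed config. Below is a minimal sketch, assuming the config is supplied as JSON. The helper name `make_zero_config` and the batch size of 1 are placeholders I made up for illustration, not values from the original issue.

```python
import json


def make_zero_config(stage: int = 2) -> dict:
    """Build a minimal DeepSpeed-style config dict for a given ZeRO stage.

    Hypothetical helper: only the keys shown here are set, and a real
    training run will normally need more (optimizer, precision, etc.).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError(f"ZeRO stage must be 0, 1, 2 or 3, got {stage}")
    return {
        # Placeholder value; set this to whatever your run actually uses.
        "train_micro_batch_size_per_gpu": 1,
        "zero_optimization": {
            # 2 = partition optimizer states and gradients,
            # 3 = additionally partition the parameters.
            "stage": stage,
        },
    }


if __name__ == "__main__":
    # Print a stage-2 config that could be saved to a JSON file.
    print(json.dumps(make_zero_config(2), indent=2))
```

Going from zero3 to zero2 then means changing only `"stage"`, which makes it a cheap experiment when a failure looks stage-specific.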

Hi, I think there should be a config.json once training has finished. Can you double-check the script and confirm that the run completed successfully?

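Hello, we do support Chinese. However, because LLaMA's pretraining data contains relatively little Chinese text, fine-tuning may need more high-quality data to achieve good results. If you have good data, feel free to give it a try. We are also experimenting and iterating on this ourselves and will share what we find with everyone.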

The checkpoints are intermediate copies of the full model. You can use adapter_model.bin once training is complete.

> > The checkpoints are intermediate full model. You can use the adapter_model.bin once the training is completed.
> >
> > How to extract adapter_model.bin file from the checkpoint folder? Due...

Given your model size, I think the precision may play a role. FP32 vs FP16/BF16?
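If the concern here is memory, a back-of-the-envelope estimate shows why precision matters at scale. This sketch counts weights only (no gradients, optimizer states, or activations), and the 7-billion-parameter figure in the usage line is a hypothetical example, not the model from the original issue.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_bytes(num_params: int, precision: str) -> int:
    """Return the bytes needed to hold the model weights alone."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision!r}")
    return num_params * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical parameter count
    for p in ("fp32", "fp16", "bf16"):
        print(f"{p}: {weight_bytes(n, p) / 1e9:.0f} GB for weights")
```

FP16 and BF16 both halve the weight footprint relative to FP32; they differ in numeric range and precision, not in size.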

Hi, it looks like the "ZeRO-Offload" configuration is not correct; you may want to double-check the yaml file. BTW, this is probably more of a DeepSpeed configuration issue.
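As a rough aid for that double check, here is a small sketch that flags two common mistakes in an offload setup. It assumes the yaml ultimately maps onto DeepSpeed's `zero_optimization` section; `check_offload` is a hypothetical helper of mine, covers only these two rules, and is not a substitute for DeepSpeed's own validation.

```python
# Device names accepted by the offload_optimizer / offload_param sections.
VALID_DEVICES = {"none", "cpu", "nvme"}


def check_offload(zero_cfg: dict) -> list:
    """Return a list of human-readable problems found in a
    zero_optimization dict (an empty list means nothing was flagged)."""
    problems = []
    stage = zero_cfg.get("stage", 0)
    for key in ("offload_optimizer", "offload_param"):
        section = zero_cfg.get(key)
        if section is None:
            continue
        device = section.get("device", "none")
        if device not in VALID_DEVICES:
            problems.append(f"{key}.device has unsupported value {device!r}")
        # Parameter offload is only available with ZeRO stage 3.
        if key == "offload_param" and device != "none" and stage != 3:
            problems.append("offload_param requires stage 3")
    return problems


if __name__ == "__main__":
    cfg = {"stage": 2, "offload_param": {"device": "cpu"}}
    for problem in check_offload(cfg):
        print(problem)
```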