Hanze Dong
BTW, can you try ZeRO-2 rather than ZeRO-3? ZeRO-3 often causes problems on the DeepSpeed side.
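For reference, switching stages is usually just a change in the DeepSpeed config. A minimal sketch of a ZeRO-2 setup written as a Python dict (the surrounding values are illustrative assumptions, not the project's actual settings):

```python
import json

# Minimal DeepSpeed config sketch using ZeRO stage 2 instead of stage 3.
# The batch-size placeholders and flags below are assumptions for illustration.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,                    # ZeRO-2: shard optimizer states and gradients only
        "overlap_comm": True,          # overlap communication with the backward pass
        "contiguous_gradients": True,  # reduce memory fragmentation
    },
}

# Write it out so the launcher can point at it, e.g. --deepspeed ds_config_zero2.json
with open("ds_config_zero2.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```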
Hi, I think there should be a config.json once training is finished. Can you double-check the script content, or whether the run actually completed successfully?
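A quick way to verify is to list the output directory and check for the expected files (the directory name below is a placeholder):

```python
import os

# Hypothetical output directory; replace with the output_dir used in your script.
output_dir = "output_models/finetune"

# config.json (and, for LoRA runs, adapter_config.json / adapter_model.bin)
# should appear only after training finishes and the final model is saved.
print(sorted(os.listdir(output_dir)))
print("config.json present:", os.path.exists(os.path.join(output_dir, "config.json")))
```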
Hi, we do support Chinese, but since LLaMA's pre-training corpus contains relatively little Chinese, fine-tuning may require more high-quality data to achieve good results. If you have good data, feel free to try it; we are also running some experiments and iterations ourselves and will share the results with everyone.
The checkpoints are intermediate full models. You can use adapter_model.bin once training is completed.
> > The checkpoints are intermediate full models. You can use adapter_model.bin once training is completed.
>
> How to extract the adapter_model.bin file from the checkpoint folder? Due...
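For context, loading a finished LoRA adapter typically looks like the sketch below. This assumes the adapter was saved with PEFT and that the output directory contains adapter_config.json and adapter_model.bin; the base model name and paths are placeholders, not the project's settings:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder names; substitute your own base model and finished output directory.
base_model_name = "decapoda-research/llama-7b-hf"
adapter_dir = "output_models/finetune"

base_model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.float16)
# PeftModel reads adapter_config.json / adapter_model.bin from adapter_dir
model = PeftModel.from_pretrained(base_model, adapter_dir)
model.eval()
```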
Given your model size, I think precision may play a role. Are you training in FP32 or FP16/BF16?
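As a rough illustration (argument values below are assumptions): 16-bit formats roughly halve parameter and activation memory relative to FP32, and in transformers they are usually enabled through the training arguments:

```python
from transformers import TrainingArguments

# Sketch only: enable mixed precision so weights/activations use 16-bit formats.
# Use bf16=True on Ampere or newer GPUs, fp16=True otherwise; do not set both.
training_args = TrainingArguments(
    output_dir="output_models/finetune",   # placeholder path
    per_device_train_batch_size=1,
    bf16=True,   # or fp16=True; plain FP32 (the default) uses roughly twice the memory
)
```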
Hi, it looks like the "ZeRO-Offload" configuration is not correct; you may want to double-check the YAML file. BTW, this might be more related to the DeepSpeed configuration.
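For comparison, the ZeRO-Offload portion of a DeepSpeed config usually takes the shape below (sketched here as a Python dict with assumed values; the exact layout of your YAML depends on how the launcher embeds the DeepSpeed settings):

```python
# Sketch of the ZeRO-Offload part of a DeepSpeed config; values are assumptions.
ds_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {   # ZeRO-Offload: move optimizer states to CPU memory
            "device": "cpu",
            "pin_memory": True,
        },
    },
    "bf16": {"enabled": True},   # illustrative; match this to your training precision
}
```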