Training fails when reusing the Qwen1 LoRA fine-tuning script

Open fanbooo opened this issue 1 year ago • 6 comments

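Hello authors. Problem description: I am using a Qwen1.5 model and reusing the LoRA fine-tuning code from https://github.com/QwenLM/Qwen with ds_config_zero2.json, training with DDP on 16 A800 GPUs. I tried model_max_length=2048 and 4096; in both cases training never reaches a normal loss computation. The log is as follows: [image]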

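Solutions I found: answers to similar questions say this is a transformers version problem. I never hit this when fine-tuning Qwen1 with the same code, but for fine-tuning Qwen1.5 I presumably cannot downgrade transformers? Current versions: transformers==4.37.0, deepspeed>=0.9.3 [image]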
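Since the post points at library versions, a quick way to see what is actually installed in the training environment may help. The following is a minimal sketch, not part of the Qwen repository: the helper names `numeric_prefix`, `meets_minimum`, and `installed_version` are made up for illustration, and the minimum versions are simply the ones quoted in the post above, not values checked against any requirements file.

```python
import re
from importlib import metadata


def numeric_prefix(version):
    """Turn '4.37.0' or '4.38.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at suffixes such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically; plain string comparison would rank '0.12' below '0.9'."""
    return numeric_prefix(installed) >= numeric_prefix(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Minimums below are copied from the post (assumption, not verified).
    for package, minimum in {"transformers": "4.37.0", "deepspeed": "0.9.3"}.items():
        found = installed_version(package)
        if found is None:
            print(f"{package}: not installed")
        else:
            status = "ok" if meets_minimum(found, minimum) else "too old"
            print(f"{package}: {found} ({status}, wanted >= {minimum})")
```

This only compares the leading numeric components, so pre-release suffixes are ignored; for strict comparisons the `packaging` library is the usual choice.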

How can this problem be resolved? Could you release the DDP code?

fanbooo avatar Feb 22 '24 03:02 fanbooo

The issue I found with a solution: https://github.com/LianjiaTech/BELLE/issues/134

fanbooo avatar Feb 22 '24 03:02 fanbooo

Try the fine-tuning script we provide, or use Axolotl or LLaMA-Factory.

JustinLin610 avatar Feb 26 '24 15:02 JustinLin610

What does your fine-tuning dataset look like? I am using the Qwen1 one, but I get the following error:

Traceback (most recent call last):
  File "/root/wangjianqiang/Qwen/Qwen-main-1.5/finetune.py", line 375, in <module>
    train()
  File "/root/wangjianqiang/Qwen/Qwen-main-1.5/finetune.py", line 348, in train
    data_module = make_supervised_data_module(
  File "/root/wangjianqiang/Qwen/Qwen-main-1.5/finetune.py", line 240, in make_supervised_data_module
    train_data.append(json.loads(line))
  File "/root/anaconda3/envs/Qwen1.5-train/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/root/anaconda3/envs/Qwen1.5-train/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/root/anaconda3/envs/Qwen1.5-train/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 2)
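The traceback shows `json.loads(line)` being called on one line of the data file at a time, which suggests the script expects JSON Lines (one record per line). The message "Expecting value: line 2 column 1 (char 2)" is what Python's `json` module reports for the two-character string `"[\n"`, so one plausible reading is that the file is a pretty-printed JSON array whose first line is just `[`. That is an inference from the error text, not something confirmed in the thread. Under that assumption, a sketch like the following could read either layout and rewrite it as JSON Lines; `parse_records` and `write_jsonl` are hypothetical helper names, not functions from the Qwen repository.

```python
import json


def parse_records(text):
    """Parse dataset text that is either one JSON array or JSON Lines.

    Assumption: each record is a JSON object, so text starting with '['
    is treated as a whole-file array rather than a JSONL line.
    """
    if text.lstrip().startswith("["):
        records = json.loads(text)
        if not isinstance(records, list):
            raise ValueError("expected a JSON array at the top level")
        return records
    # JSON Lines: one JSON value per non-empty line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def write_jsonl(records, path):
    """Write records one per line, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A typical use would be to read the existing file with `open(path, encoding="utf-8").read()`, pass the text to `parse_records`, and write the result with `write_jsonl` before pointing the training script at the new file. Whether the record fields themselves match what the Qwen1.5 script expects is a separate question that this sketch does not address.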

WangJianQ-0118 avatar Feb 29 '24 11:02 WangJianQ-0118

I rewrote the dataloader part entirely, following the data format of modelscope-agent-7b.

fanbooo avatar Mar 07 '24 06:03 fanbooo

Did you manage to solve this problem?

suanfaxiaohuo avatar Mar 12 '24 13:03 suanfaxiaohuo