LLaMA-Factory: Full-parameter fine-tuning of a Qwen1.5 model fails with "'weight' must be 2-D"

Open Leekinxun opened this issue 9 months ago • 0 comments

Reminder

[X] I have read the README and searched the existing issues.

Reproduction

The training script is as follows:

CUDA_VISIBLE_DEVICES=1,2,3 python src/train.py \
    --deepspeed deepspeed/ds_config_zero3.json \
    --stage sft \
    --do_train \
    --model_name_or_path "/data02/pretrained_models/Qwen1.5-4B-Chat" \
    --dataset health_style \
    --template qwen \
    --finetuning_type full \
    --output_dir "save_models/Qwen1.5-4b_healthy" \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 100 \
    --save_steps 56850 \
    --learning_rate 3e-4 \
    --num_train_epochs 150.0 \
    --plot_loss \
    --ddp_timeout 180000000 \
    --fp16

Expected behavior

Training fails with the error: 'weight' must be 2-D

System Info

No response

Others

No response

May 10 '24 03:05 Leekinxun
