LLaMA-Factory
Full fine-tuning of Qwen1.5 model fails with error: 'weight' must be 2-D
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
The training script is as follows:
CUDA_VISIBLE_DEVICES=1,2,3 python src/train.py \
    --deepspeed deepspeed/ds_config_zero3.json \
    --stage sft \
    --do_train \
    --model_name_or_path "/data02/pretrained_models/Qwen1.5-4B-Chat" \
    --dataset health_style \
    --template qwen \
    --finetuning_type full \
    --output_dir "save_models/Qwen1.5-4b_healthy" \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 100 \
    --save_steps 56850 \
    --learning_rate 3e-4 \
    --num_train_epochs 150.0 \
    --plot_loss \
    --ddp_timeout 180000000 \
    --fp16
Expected behavior
Training fails with the error: 'weight' must be 2-D
System Info
No response
Others
No response