How should I adjust the fine-tuning parameters?

Open zhaoweihan2017 opened this issue 1 year ago • 0 comments

Reminder

[X] I have read the README and searched the existing issues.

Reproduction

Background: I am fine-tuning the Qwen1.5-0.5B model. Dataset: https://huggingface.co/datasets/michaelwzhu/ShenNong_TCM_Dataset, a traditional Chinese medicine (TCM) dataset. Training parameters:

CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train True \
    --model_name_or_path /data/workspace/Qwen1.5-0.5B \
    --finetuning_type lora \
    --template qwen \
    --flash_attn auto \
    --dataset_dir data \
    --dataset train_med_data_alpaca \
    --cutoff_len 512 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --report_to none \
    --output_dir saves/Qwen1.5-0.5B/lora/train_2024-05-09-13-52-med \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --plot_loss True
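
For reference, here is a rough sketch of what these settings imply for batch size and step count. The function name `training_budget` is hypothetical, and the sample count is only the `--max_samples` upper bound (the actual dataset may be smaller), so the numbers are approximate rather than what the trainer will report exactly.

```python
import math


def training_budget(per_device_batch, grad_accum, num_devices,
                    num_samples, epochs, logging_steps):
    """Approximate batch and step arithmetic for a run (hypothetical helper)."""
    # Samples consumed per optimizer step across all devices.
    effective_batch = per_device_batch * grad_accum * num_devices
    # Optimizer steps needed to see every sample once.
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = int(steps_per_epoch * epochs)
    # Samples that contribute to one logged loss value.
    samples_per_log = effective_batch * logging_steps
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "samples_per_log": samples_per_log,
    }


# Values copied from the command above; CUDA_VISIBLE_DEVICES=0 exposes one GPU.
# num_samples uses --max_samples as an upper bound (assumption).
budget = training_budget(
    per_device_batch=2,
    grad_accum=8,
    num_devices=1,
    num_samples=100_000,
    epochs=3.0,
    logging_steps=5,
)
print(budget)
```

Under these assumptions the effective batch size is 16, and each logged loss value summarizes roughly 80 samples, so some point-to-point noise in the raw curve is to be expected.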

Problem/symptom: the model never converges during training, and the loss keeps oscillating.
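
One way to check whether an oscillating loss still has a downward trend is to smooth the logged values. The sketch below applies a simple exponential moving average. The function name `ema` and the sample loss values are hypothetical and are not taken from this run.

```python
def ema(values, weight=0.9):
    """Exponential moving average of a sequence, seeded with the first value."""
    smoothed = []
    last = values[0] if values else None
    for v in values:
        # Blend the running average with the new observation.
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed


# Hypothetical logged losses, for illustration only.
raw_losses = [2.9, 3.4, 2.7, 3.1, 2.6, 3.0, 2.4, 2.8]
for step, (raw, avg) in enumerate(zip(raw_losses, ema(raw_losses)), start=1):
    print(f"log {step}: raw={raw:.2f} smoothed={avg:.3f}")
```

A higher `weight` gives a smoother curve that reacts more slowly. If the smoothed curve is flat as well, the oscillation is probably not just logging noise.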

Expected behavior

I would appreciate suggestions for optimizing the fine-tuning parameters so that the fine-tuned model performs well.

System Info

No response

Others

No response

May 09 '24 02:05 zhaoweihan2017