Loss keeps increasing

Open ahsbdcpu opened this issue 1 year ago • 0 comments

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

Why does the loss suddenly jump partway through continued pretraining? It started at around 2.x and climbed to around 5.x. What could be going on?

Expected behavior

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage pt \
    --do_train True \
    --model_name_or_path meta-llama/Meta-Llama-3-8B \
    --finetuning_type lora \
    --template llama3 \
    --flash_attn auto \
    --use_unsloth True \
    --dataset_dir data \
    --dataset 公開健康資料集第一份,公開健康資料集第二份 \
    --cutoff_len 1024 \
    --learning_rate 3e-04 \
    --num_train_epochs 2.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing True \
    --report_to all \
    --report_to none \
    --output_dir saves/LLaMA3-8B/lora/PT醫療 \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --save_steps 100 \
    --eval_steps 100 \
    --val_size 0.2 \
    --evaluation_strategy steps \
    --load_best_model_at_end True \
    --lora_target q_proj,v_proj \
    --plot_loss True
```
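To narrow down where the spike begins, the per-step loss can be pulled from the training log and scanned for the first abrupt jump. The sketch below assumes LLaMA-Factory writes a `trainer_log.jsonl` (one JSON object per logging step) into `--output_dir`; the field names `current_steps` and `loss`, and the 1.5x jump threshold, are assumptions for illustration, not part of the original report.

```python
import json

def load_losses(path):
    """Parse a trainer_log.jsonl-style file (one JSON object per line)
    and return (step, loss) pairs for entries that report a loss.
    Field names are assumed; adjust to the actual log format."""
    pairs = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rec = json.loads(line)
            if "loss" in rec:
                pairs.append((rec.get("current_steps", 0), rec["loss"]))
    return pairs

def first_spike(losses, ratio=1.5):
    """Return the index of the first point where loss exceeds `ratio`
    times the previous value, or None if no such jump occurs."""
    for i in range(1, len(losses)):
        if losses[i] > ratio * losses[i - 1]:
            return i
    return None
```

For example, `first_spike([loss for _, loss in load_losses("saves/.../trainer_log.jsonl")])` would point at the first logging step where the reported 2.x-to-5.x jump starts, which helps correlate it with checkpoints saved every 100 steps.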

System Info

No response

Others

No response

ahsbdcpu avatar May 24 '24 01:05 ahsbdcpu