LLaMA-Factory
LLaMA Pro, method: LoRA, stage: ORPO, module-expanded Llama-3-8B-Instruct did not work as expected: the fine-tuned model does not infer as well as the original model
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
    --stage orpo \
    --do_train True \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --finetuning_type lora \
    --template llama3 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset comparison_gpt4_zh \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/Custom/lora/train_2024-05-12-19-09-46 \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --additional_target all \
    --orpo_beta 0.1 \
    --plot_loss True
Note: modified datasets were used.
The final training loss was 0.06.
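For a quick side-by-side check, the adapter can be loaded on top of the expanded model in the interactive chat and the same prompts sent with and without the adapter. This is only a sketch: the chat subcommand and --adapter_name_or_path are assumed to behave as in current LLaMA-Factory releases, and the paths are the ones from the training command above.

CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --adapter_name_or_path saves/Custom/lora/train_2024-05-12-19-09-46 \
    --template llama3

Omitting --adapter_name_or_path gives the baseline behavior of the expanded model for comparison.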
Expected behavior
The fine-tuned model should improve on the original model's performance in specific domains while keeping its general capabilities.
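One way to check that general ability is actually retained, rather than only eyeballing chat outputs, is to benchmark the base model and the adapter with the built-in evaluator. This is a hedged sketch: the exact flag names and the task id are assumed to be supported by this LLaMA-Factory version.

CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --template llama3 \
    --task mmlu \
    --n_shot 5 \
    --batch_size 4

CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --adapter_name_or_path saves/Custom/lora/train_2024-05-12-19-09-46 \
    --template llama3 \
    --task mmlu \
    --n_shot 5 \
    --batch_size 4

A large drop in the adapter's score relative to the base model would indicate that general knowledge is being lost during ORPO training.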
System Info
No response
Others
No response
Thanks, your great project is so helpful and saves so much time on fine-tuning!
Any solutions? Thanks!
Use a better dataset instead of the default one, and be cautious about overfitting.
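As a concrete (hedged) sketch of those suggestions, the run above could be repeated with a lower learning rate, a single epoch, some LoRA dropout, and a held-out validation split so that eval loss can flag overfitting. The --val_size, --do_eval, --evaluation_strategy, --eval_steps and --per_device_eval_batch_size options are assumed to be supported by this LLaMA-Factory/transformers version, the output directory name is made up, and --dataset should point at your improved preference dataset rather than the default one.

CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
    --stage orpo \
    --do_train True \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --finetuning_type lora \
    --template llama3 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset comparison_gpt4_zh \
    --cutoff_len 1024 \
    --learning_rate 1e-05 \
    --num_train_epochs 1.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --val_size 0.05 \
    --do_eval True \
    --evaluation_strategy steps \
    --eval_steps 100 \
    --per_device_eval_batch_size 2 \
    --output_dir saves/Custom/lora/train_orpo_regularized \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target all \
    --additional_target all \
    --orpo_beta 0.1 \
    --plot_loss True

If the eval loss starts rising while the train loss keeps falling toward values like 0.06, the run is overfitting and an earlier checkpoint should be used.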
Thanks!