LLaMA-Factory
LLaMA Pro, method: LoRA, stage: ORPO, module-expanded Llama-3-8B-Instruct did not work as expected: the fine-tuned model does not infer as well as the original model
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
    --stage orpo \
    --do_train True \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --finetuning_type lora \
    --template llama3 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset comparison_gpt4_zh \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/Custom/lora/train_2024-05-12-19-09-46 \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --additional_target all \
    --orpo_beta 0.1 \
    --plot_loss True
Note: modified datasets were used.
The final training loss was 0.06.
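For a quick side-by-side check, the adapter can be loaded on top of the expanded model in the interactive chat and the same prompts sent with and without the adapter. This is only a sketch: the chat subcommand and --adapter_name_or_path are assumed to behave as in current LLaMA-Factory releases, and the paths are the ones from the training command above.

CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --adapter_name_or_path saves/Custom/lora/train_2024-05-12-19-09-46 \
    --template llama3

Omitting --adapter_name_or_path gives the baseline behavior of the expanded model for comparison.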
Expected behavior
The fine-tuned model should improve on the original model's performance in specific domains while keeping its general capabilities.
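One way to check that general ability is actually retained, rather than only eyeballing chat outputs, is to benchmark the base model and the adapter with the built-in evaluator. This is a hedged sketch: the exact flag names and the task id are assumed to be supported by this LLaMA-Factory version.

CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --template llama3 \
    --task mmlu \
    --n_shot 5 \
    --batch_size 4

CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --adapter_name_or_path saves/Custom/lora/train_2024-05-12-19-09-46 \
    --template llama3 \
    --task mmlu \
    --n_shot 5 \
    --batch_size 4

A large drop in the adapter's score relative to the base model would indicate that general knowledge is being lost during ORPO training.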
System Info
No response
Others
No response
Thanks, your great project is so helpful and saves so much time on fine-tuning!
Any solutions? Thanks!
Use a better dataset instead of the default one, and be cautious about overfitting.
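As a concrete (hedged) sketch of those suggestions, the run above could be repeated with a lower learning rate, a single epoch, some LoRA dropout, and a held-out validation split so that eval loss can flag overfitting. The --val_size, --do_eval, --evaluation_strategy, --eval_steps and --per_device_eval_batch_size options are assumed to be supported by this LLaMA-Factory/transformers version, the output directory name is made up, and --dataset should point at your improved preference dataset rather than the default one.

CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
    --stage orpo \
    --do_train True \
    --model_name_or_path /home/ubuntu/LLaMA-Factory/models/llama3-8b-instruct-pro \
    --finetuning_type lora \
    --template llama3 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset comparison_gpt4_zh \
    --cutoff_len 1024 \
    --learning_rate 1e-05 \
    --num_train_epochs 1.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --val_size 0.05 \
    --do_eval True \
    --evaluation_strategy steps \
    --eval_steps 100 \
    --per_device_eval_batch_size 2 \
    --output_dir saves/Custom/lora/train_orpo_regularized \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target all \
    --additional_target all \
    --orpo_beta 0.1 \
    --plot_loss True

If the eval loss starts rising while the train loss keeps falling toward values like 0.06, the run is overfitting and an earlier checkpoint should be used.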
Thanks!