LLaMA-Factory
Single-node multi-GPU: GPU memory suddenly spikes when saving the model
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
```shell
deepspeed --num_gpus 2 src/train_bash.py \
    --deepspeed ds_config.json \
    --stage sft \
    --do_train \
    --model_name_or_path /home/zhouyu/pretrained_model/llm/Yi-34B-Chat \
    --dataset_dir data \
    --dataset alpaca_gpt4_zh \
    --template yi \
    --finetuning_type lora \
    --lora_target all \
    --output_dir output/yi-34-chat \
    --preprocessing_num_workers 32 \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 1 \
    --save_steps 2 \
    --eval_steps 2 \
    --val_size 0.001 \
    --learning_rate 1e-5 \
    --num_train_epochs 3.0 \
    --evaluation_strategy steps \
    --plot_loss \
    --bf16
```
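A likely explanation (an assumption, since `ds_config.json` is not shown here): when training under DeepSpeed ZeRO-3, saving a checkpoint gathers the full 16-bit model weights onto the ranks, which causes a temporary GPU memory spike at each `save_steps` boundary. If that is the configuration in use, the gather can be disabled in `ds_config.json`, for example:

```json
{
  "zero_optimization": {
    "stage": 3,
    "stage3_gather_16bit_weights_on_model_save": false
  }
}
```

With the gather disabled, only sharded ZeRO checkpoints are written at save time; DeepSpeed's `zero_to_fp32.py` script can merge them into a full-precision state dict offline afterwards.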
Expected behavior
System Info
No response
Others
No response