DeepSpeedExamples
DeepSpeed-Chat step-1 hanging for a long time
deepspeed --hostfile ~/hostfile \
    --num_gpus 4 \
    --num_nodes 2 \
    --master_addr 172.16.4.41 \
    main.py \
    --data_path Dahoas/rm-static \
    --data_split 2,4,4 \
    --model_name_or_path shakechen/Llama-2-7b-hf/ \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --max_seq_len 512 \
    --learning_rate 9.65e-6 \
    --weight_decay 0. \
    --num_train_epochs 1 \
    --gradient_accumulation_steps 1 \
    --lr_scheduler_type cosine \
    --num_warmup_steps 0 \
    --seed 1234 \
    --gradient_checkpointing \
    --zero_stage 3 \
    --deepspeed \
    --output_dir /home/bingxing2/home/scx7avs/Deepspeed/output/
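For reference, the command above expects ~/hostfile to describe 2 nodes with 4 GPUs each. The contents of that file are not shown in this report, so the following is only a sketch of what a matching hostfile might look like, using DeepSpeed's `hostname slots=N` format. It assumes the master node (172.16.4.41) is also listed as a worker, and `node2.example` is a hypothetical placeholder for the second node. The snippet writes the sketch to a temporary file and sums the slots, to check that the total matches num_nodes x num_gpus.

```shell
# Hypothetical hostfile for a 2-node, 4-GPU-per-node launch.
# 172.16.4.41 is the --master_addr from the command; node2.example is a
# placeholder, since the second node's address is not given in the report.
hostfile="$(mktemp)"
cat > "$hostfile" <<'EOF'
172.16.4.41 slots=4
node2.example slots=4
EOF

# Sum the slots=N values across all hosts; this should equal
# num_nodes * num_gpus (2 * 4) for the launch flags above.
total_slots=$(awk -F'slots=' 'NF>1 {s+=$2} END {print s+0}' "$hostfile")
echo "total slots: $total_slots"
```

If the total differs from what the launch flags imply, or a listed host is unreachable over passwordless SSH, a multi-node launch can stall while waiting for ranks that never join, so this is one thing worth ruling out.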