
[BUG]

Open XZhang97666 opened this issue 2 years ago • 2 comments

export CUDA_VISIBLE_DEVICES=2,3

task=medqa_usmle_hf
datadir=data/$task
outdir=runs/$task/GPT2
mkdir -p $outdir
seed=42

deepspeed --num_gpus 2 --num_nodes 1 run_multiple_choice.py \
  --tokenizer_name stanford-crfm/pubmed_gpt_tokenizer \
  --model_name_or_path "stanford-crfm/BioMedLM" \
  --train_file ../../../SLMReason/data/MedQA/BertMC/train.json \
  --validation_file ../../../SLMReason/data/MedQA/BertMC/validation.json \
  --test_file ../../../SLMReason/data/MedQA/BertMC/test.json \
  --do_train --do_eval --do_predict \
  --per_device_train_batch_size 1 \
  --per_device_eval_batch_size 1 \
  --gradient_accumulation_steps 32 \
  --learning_rate 2e-6 --warmup_ratio 0.5 --num_train_epochs 10 \
  --max_seq_length 512 --seed $seed --data_seed $seed \
  --logging_first_step --logging_steps 20 \
  --save_strategy no --evaluation_strategy steps --eval_steps 500 \
  --run_name debug \
  --output_dir trash/ \
  --overwrite_output_dir \
  --deepspeed ds_config_zero3.json \
  --fp16

XZhang97666 • Mar 21 '23

What is the error that you're seeing?

Trying to specify which CUDA devices to use?

loadams • Apr 13 '23

FYI, CUDA_VISIBLE_DEVICES does not work with the deepspeed launcher; see https://www.deepspeed.ai/getting-started/

mrwyattii • Apr 13 '23
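
For readers landing here: the getting-started page linked above describes the launcher's `--include` flag as the way to pick specific GPUs on a node. The following is a minimal sketch of how the original command might be adapted, not a confirmed fix from this thread. It assumes a single-node run where `localhost` is the right host name, abbreviates the script arguments, and drops `--num_gpus`/`--num_nodes` on the assumption that the launcher does not accept them together with `--include`. It only prints the command (a dry run) so it can be checked before launching.

```shell
# Sketch: select GPUs 2 and 3 through the launcher instead of the environment.
# Clear the variable so it cannot conflict with the launcher's own selection.
unset CUDA_VISIBLE_DEVICES

gpus="2,3"                     # device indices from the original report
include="localhost:${gpus}"    # format: <hostname>:<comma-separated GPU ids>

# Script arguments are abbreviated here; the full list is in the report above.
# --num_gpus/--num_nodes are omitted (assumed incompatible with --include).
cmd="deepspeed --include ${include} run_multiple_choice.py --deepspeed ds_config_zero3.json --fp16"

# Dry run: print the command. Run it with: eval "$cmd"
echo "$cmd"
```

The launcher also documents an `--exclude` flag with the same `<hostname>:<ids>` format for the opposite case, where all but a few devices should be used.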

This looks to be a duplicate of #3070, answered there as well.

loadams • Apr 18 '23