
Update requirements.txt

tpoisonooo opened this issue 2 years ago · 2 comments

I got this error:

```
python3.10/site-packages/transformers-4.27.0.dev0-py3.10.egg/transformers/trainer.py", line 1460, in _wrap_model
    self.model = model = FSDP(
TypeError: FullyShardedDataParallel.__init__() got an unexpected keyword argument 'forward_prefetch'
```

torch 1.12 does not support `forward_prefetch`.

So I switched torch to 1.13.1, and it works!
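If you need to stay compatible with both torch versions rather than upgrade, a minimal sketch (not the actual transformers code) of gating the kwarg on the installed version:

```python
# Sketch: only pass `forward_prefetch` to FSDP on torch >= 1.13,
# since torch 1.12's FSDP does not accept that keyword argument.
import torch
from packaging import version
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

fsdp_kwargs = {}
if version.parse(torch.__version__) >= version.parse("1.13"):
    fsdp_kwargs["forward_prefetch"] = True

# model = FSDP(model, **fsdp_kwargs)  # requires an initialized process group
```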

tpoisonooo · Mar 23 '23 14:03

Hi, @tpoisonooo

Are you able to run fine-tuning successfully?

akanyaani · Mar 28 '23 07:03

> Hi, @tpoisonooo
>
> Are you able to run fine-tuning successfully?

Yes. Here are my version table and run.sh, run on an A100:

| repo | commit-id / version |
| --- | --- |
| stanford_alpaca | eb5b171d9b103a12a8e14e0edca9cbc45fe1d512 |
| transformers | 68d640f7c368bcaaaecfc678f11908ebbd3d6176 |
| torch | 1.13.1 |
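For pinning, a hypothetical requirements.txt matching the table above (assuming transformers is installed from that commit via pip's git support; stanford_alpaca is the checkout itself, not a pip dependency):

```
torch==1.13.1
git+https://github.com/huggingface/transformers.git@68d640f7c368bcaaaecfc678f11908ebbd3d6176
```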
$ cat run.sh

```bash
CUDA_VISIBLE_DEVICES=7,6,5,4,2,1,3,0 torchrun --nproc_per_node=4 --master_port=9999 train.py \
    --model_name_or_path ${PATH_TO_LLAMA} \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir ${OUTPUT_PATH} \
    --num_train_epochs 3 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \
    --tf32 True
```
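The `${PATH_TO_LLAMA}` and `${OUTPUT_PATH}` placeholders must be set before launching, for example (hypothetical paths, assuming HF-converted LLaMA weights):

```bash
# Hypothetical paths; PATH_TO_LLAMA should point at HF-format LLaMA weights.
export PATH_TO_LLAMA=/data/llama-7b-hf
export OUTPUT_PATH=./alpaca-7b-out
bash run.sh
```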

You can also try alpaca_lora together with the transformers master branch to save memory.
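For reference, a minimal LoRA setup sketch with the peft library (an assumption about the general alpaca_lora-style approach, not that repo's exact code; the model path is hypothetical):

```python
# Sketch: wrap a causal LM with LoRA adapters via peft, so only the
# low-rank adapter weights are trained (the base model stays frozen).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("/data/llama-7b-hf")  # hypothetical path
lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # LLaMA attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # shows the small trainable fraction
```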

tpoisonooo · Apr 03 '23 05:04