peft add accelerate example for DDP and FSDP in sequence classification fo…

Open sywangyi opened this issue 1 year ago • 1 comments

…r non-lora case

Apr 23 '23 05:04 sywangyi

@pacman100 please help review.

Apr 23 '23 05:04 sywangyi

The documentation is not available anymore as the PR was closed or merged.

May 03 '23 07:05 HuggingFaceDocBuilderDev

Yes, @pacman100, I see a memory decrease with FSDP. I fine-tuned LLaMA 7B on 2 GPUs (RTX 8000) using p-tuning. With a training batch size of 8, DDP crashes with an OOM error, while FSDP runs without crashing. Enabling CPU offload in FSDP reduces GPU memory further compared with FSDP without CPU offload. Note that you need to apply 352 to use CPU offload in FSDP.

May 04 '23 02:05 sywangyi
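
A rough back-of-the-envelope sketch of why the DDP run could hit OOM while FSDP does not. It only counts model weights and rests on assumptions that are not stated in the thread: a round 7e9 parameters for the "7b" model, fp32 weights (4 bytes each), and even sharding across the 2 GPUs. The helper `weights_gib` is a hypothetical name used for illustration. Activations, the p-tuning prompt parameters, and optimizer state are ignored, so real usage will be higher.

```python
# Rough per-GPU estimate for model weights only (not a measurement).
# Assumptions: 7e9 parameters, fp32 storage, weights evenly sharded by FSDP.

def weights_gib(num_params: float, bytes_per_param: int, num_shards: int = 1) -> float:
    """Approximate per-GPU memory for model weights, in GiB."""
    return num_params * bytes_per_param / num_shards / 2**30

NUM_PARAMS = 7e9   # assumption: round figure for a "7b" model
BYTES_FP32 = 4     # assumption: weights kept in fp32
NUM_GPUS = 2       # from the comment above: 2 GPUs

# DDP keeps a full replica of the weights on every GPU.
ddp_per_gpu = weights_gib(NUM_PARAMS, BYTES_FP32)

# FSDP shards the weights, so each GPU holds roughly 1/NUM_GPUS at rest.
fsdp_per_gpu = weights_gib(NUM_PARAMS, BYTES_FP32, NUM_GPUS)

print(f"DDP weights per GPU:  {ddp_per_gpu:.1f} GiB")
print(f"FSDP weights per GPU: {fsdp_per_gpu:.1f} GiB")
```

Under these assumptions the DDP replica alone takes a large share of each GPU before any activations for batch size 8 are allocated, while FSDP holds about half of that at rest. FSDP still gathers full parameters for each wrapped unit during forward and backward, so its peak is above the at-rest figure. CPU offload moves the sharded parameters to host memory when they are not in use, which would lower the at-rest GPU figure further.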
