
add accelerate example for DDP and FSDP in sequence classification for non-lora case

Open sywangyi opened this issue 1 year ago • 1 comments

sywangyi, Apr 23 '23 05:04

@pacman100 please help review.

sywangyi, Apr 23 '23 05:04

The documentation is not available anymore as the PR was closed or merged.

Yes, @pacman100, I see a memory decrease with FSDP. I fine-tuned llama 7b on 2 GPUs (RTX8000) using p-tuning. Without FSDP, DDP crashes with OOM when the training batch size is set to 8, while FSDP does not crash. With CPU offload enabled, FSDP memory usage decreases further compared with FSDP without CPU offload. Note that you need to apply 352 to use CPU offload in FSDP.

sywangyi, May 04 '23 02:05
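
For readers wondering why FSDP helps here, a rough back-of-envelope sketch may be useful. It only estimates the memory needed to hold the model weights on each GPU, and it rests on assumptions not stated in the thread: roughly 7e9 parameters, fp32 weights (4 bytes each), and full sharding across both GPUs. The helper `weight_memory_gib` is hypothetical and written just for this illustration. Activations, gradients, optimizer state, and the temporary all-gather of each FSDP unit during forward and backward are ignored, so real peak usage will be higher.

```python
# Back-of-envelope estimate of per-GPU memory for model weights only.
# All inputs are illustrative assumptions, not measurements from this thread.

def weight_memory_gib(num_params: float, bytes_per_param: int, num_shards: int = 1) -> float:
    """Return the GiB needed to hold one shard of the model weights."""
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    return num_params * bytes_per_param / num_shards / 2**30


NUM_PARAMS = 7e9   # assumed parameter count for a "7b" model
BYTES_FP32 = 4     # assumed precision; the thread does not state it
NUM_GPUS = 2       # from the thread: 2 GPUs

# DDP keeps a full replica of the weights on every GPU.
ddp_gib = weight_memory_gib(NUM_PARAMS, BYTES_FP32)

# FSDP with full sharding keeps only 1/NUM_GPUS of the weights resident per GPU.
fsdp_gib = weight_memory_gib(NUM_PARAMS, BYTES_FP32, NUM_GPUS)

print(f"DDP  weights per GPU: {ddp_gib:.1f} GiB")
print(f"FSDP weights per GPU: {fsdp_gib:.1f} GiB")
```

Under these assumptions, sharding halves the resident weight memory per GPU, which leaves more room for activations at a larger batch size. CPU offload would reduce the resident GPU share further by keeping sharded parameters in host memory when they are not in use.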