Wang, Yi
@pacman100 please help review.
Yes, @pacman100, I see the memory decrease with FSDP. I fine-tune LLaMA 7B on 2 GPUs (RTX 8000) using p-tuning; if FSDP is not used, DDP crashes with OOM if...
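For reference, a minimal sketch of the p-tuning setup described above (model name and hyperparameter values are illustrative, not the exact script in this PR): the base LLaMA weights stay frozen and only the small prompt encoder is trained, which is why memory usage drops under FSDP.

```python
# Hedged sketch: wrap a LLaMA model with PEFT p-tuning (PromptEncoderConfig).
# The checkpoint name and the num_virtual_tokens / encoder_hidden_size values
# are assumptions for illustration only.
from transformers import AutoModelForCausalLM
from peft import PromptEncoderConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # any LLaMA-7B checkpoint

peft_config = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,     # illustrative value
    encoder_hidden_size=128,   # illustrative value
)

model = get_peft_model(model, peft_config)
# Only the prompt-encoder parameters are trainable; the 7B base model is frozen.
model.print_trainable_parameters()
```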
@abhilash1910 @muellerzr please help review
@muellerzr @sgugger I have updated the PR according to the discussion; please help review it.
@libinta @regisss please help review. Prompt tuning is just the first step; next I will enable prefix-tuning and p-tuning based on the example.
Prompt tuning, prefix-tuning, and p-tuning are enabled for LLaMA fine-tuning and inference.
The inference part needs model changes to adapt to prompt tuning/prefix tuning, since virtual tokens are used in prompt/p-tuning and past_key_values is extended in prefix tuning (see the sketch after these notes).
Also upgrade PEFT to the latest tag, 0.9.0.
The prompt tuning + DeepSpeed ZeRO-3 issue is fixed by https://github.com/huggingface/peft/pull/1591.
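A minimal inference sketch under these assumptions (the checkpoint name and adapter path are placeholders, not the exact code in this PR): PEFT's causal-LM wrapper prepends the learned virtual-token embeddings for prompt/p-tuning and injects the learned prefix into past_key_values for prefix tuning, so generation is called the usual way.

```python
# Hedged sketch: run inference with a LLaMA model fine-tuned via PEFT
# prompt tuning / p-tuning / prefix tuning. Paths and the prompt text
# are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "huggyllama/llama-7b"   # assumption: any LLaMA checkpoint
adapter_path = "path/to/peft_adapter"     # assumption: output dir of the fine-tune run

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

inputs = tokenizer("Tell me a joke.", return_tensors="pt")
with torch.no_grad():
    # For prompt/p-tuning the effective sequence length grows by
    # peft_config.num_virtual_tokens; for prefix tuning the learned prefix
    # key/value states are concatenated into past_key_values instead.
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```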