optimum-habana
Enable prompt tuning/prefix tuning/p-tuning for CLM, with an example
What does this PR do?
Fixes # (issue)
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you make sure to update the documentation with your changes?
- [ ] Did you write any new necessary tests?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@libinta @regisss please help review. Prompt tuning is just a first step; next I will enable prefix tuning and p-tuning based on the example.
Prompt tuning, prefix tuning and p-tuning are enabled for Llama fine-tuning and inference.
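For reference, here is a minimal sketch (not the actual example script in this PR) of how the three methods map onto PEFT configs for a causal LM; the checkpoint name and hyperparameters are only illustrative:

```python
from peft import (
    PrefixTuningConfig,
    PromptEncoderConfig,
    PromptTuningConfig,
    PromptTuningInit,
    TaskType,
    get_peft_model,
)
from transformers import AutoModelForCausalLM

model_name = "huggyllama/llama-7b"  # illustrative, any Llama checkpoint

# Prompt tuning: learns num_virtual_tokens embeddings prepended to the input.
prompt_tuning = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=8,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Complete the sentence:",
    tokenizer_name_or_path=model_name,
)

# Prefix tuning: learns key/value prefixes injected into every attention layer.
prefix_tuning = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=30)

# P-tuning: learns virtual tokens through a small prompt encoder (MLP/LSTM).
p_tuning = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,
    encoder_hidden_size=128,
)

model = AutoModelForCausalLM.from_pretrained(model_name)
model = get_peft_model(model, prompt_tuning)  # or prefix_tuning / p_tuning
model.print_trainable_parameters()
```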
The inference part needs model changes to adapt to prompt tuning/prefix tuning, since virtual tokens are used in prompt/p-tuning and past_key_values is extended in prefix tuning.
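As an illustration of that inference path, here is a minimal sketch of plain PEFT inference (the Gaudi-specific model changes themselves live in the diff, not here); the adapter directory is hypothetical:

```python
# Load a saved prompt-tuning/prefix-tuning adapter and generate with it.
import torch
from peft import AutoPeftModelForCausalLM, PeftConfig
from transformers import AutoTokenizer

adapter_path = "./clm_peft_adapter"  # hypothetical output dir of the fine-tuning run
peft_config = PeftConfig.from_pretrained(adapter_path)
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)
model = AutoPeftModelForCausalLM.from_pretrained(adapter_path, torch_dtype=torch.bfloat16)

inputs = tokenizer("Tell me something about Habana Gaudi.", return_tensors="pt")
# For prompt/p-tuning, PEFT prepends num_virtual_tokens learned embeddings to the
# input; for prefix tuning, it extends past_key_values with learned prefixes.
# The generation code therefore has to account for the extra (virtual) length.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```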
Also upgraded PEFT to the latest tag, 0.9.0.
The prompt tuning + DeepSpeed ZeRO-3 issue is fixed by https://github.com/huggingface/peft/pull/1591.
Found that DeepSpeed ZeRO-3 + prompt tuning hangs after saving a checkpoint; it is fixed by https://github.com/huggingface/transformers/pull/29980. I will port a similar change to the Optimum Habana trainer.
@sywangyi I just merged #809, can you merge it into this branch and resolve the merge conflicts please?
@sywangyi Why call the new arg `--no_ignore_eos` and not simply `--ignore_eos` as everywhere else in the codebase? If you just want it to be true by default, I think this arg should rather be a bool with a default value of True. That is clearer than an arg named `no_ignore_eos` with `store_false`.
@regisss A bool arg does not work with argparse, see https://stackoverflow.com/questions/15008758/parsing-boolean-values-with-argparse. Anyway, I picked up one of the solutions there to fix it.
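For context, the workarounds in that thread generally boil down to a custom `str2bool` type; whether this is the exact variant used in the commit is not shown here:

```python
import argparse

def str2bool(value):
    # Parse "true"/"false"-style strings into real booleans for argparse.
    if isinstance(value, bool):
        return value
    if value.lower() in ("yes", "true", "t", "1"):
        return True
    if value.lower() in ("no", "false", "f", "0"):
        return False
    raise argparse.ArgumentTypeError("Boolean value expected.")

parser = argparse.ArgumentParser()
parser.add_argument("--ignore_eos", type=str2bool, default=True)
print(parser.parse_args(["--ignore_eos", "false"]).ignore_eos)  # False
```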
I pushed a new commit where the logic is simpler. It only works with Python 3.9+, which is fine as Habana Docker images don't support Ubuntu 20.04 anymore.
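The Python 3.9+ pattern that matches this description is presumably `argparse.BooleanOptionalAction` (inferred here, the commit itself is not quoted): a plain boolean default plus an auto-generated negative flag.

```python
import argparse

parser = argparse.ArgumentParser()
# BooleanOptionalAction (added in Python 3.9) registers both --ignore_eos and
# --no-ignore_eos and stores a plain boolean, defaulting to True here.
parser.add_argument("--ignore_eos", action=argparse.BooleanOptionalAction, default=True)

print(parser.parse_args([]).ignore_eos)                   # True
print(parser.parse_args(["--no-ignore_eos"]).ignore_eos)  # False
```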