Adding Zero-Init Attention Adaptation
At some point, it may be worthwhile to add Zero-Init Attention Adaptation from the arXiv preprint *LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention*.
The method also supports multi-modal instructions.
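For context, the core mechanism is fairly small: a short learnable adaption prompt is prepended to the keys and values in the top attention layers of the frozen model, and its contribution is scaled by a zero-initialized gate so training starts exactly from the base model's behaviour. Below is a minimal, single-head PyTorch sketch of that idea (causal masking, multi-head attention, and the per-layer insertion are omitted; all class and parameter names are illustrative, not taken from the paper's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroInitGatedAttention(nn.Module):
    """Minimal single-head sketch of zero-init attention (illustrative only).

    A learnable adaption prompt is prepended to the keys/values, and its
    attention scores are scaled by a gate that starts at zero, so the module
    initially reproduces plain attention over the input tokens.
    """

    def __init__(self, dim: int, prompt_len: int = 10):
        super().__init__()
        self.dim = dim
        # Learnable adaption prompt, shared across the batch.
        self.adaption_prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        # Zero-initialized gate: at step 0 the prompt contributes nothing.
        self.gate = nn.Parameter(torch.zeros(1))
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        bsz = x.size(0)
        q = self.q_proj(x)  # (B, T, D)
        k = self.k_proj(x)  # (B, T, D)
        v = self.v_proj(x)  # (B, T, D)
        pk = self.k_proj(self.adaption_prompt).expand(bsz, -1, -1)  # (B, K, D)
        pv = self.v_proj(self.adaption_prompt).expand(bsz, -1, -1)  # (B, K, D)

        scale = self.dim ** -0.5
        scores_tokens = torch.matmul(q, k.transpose(-2, -1)) * scale   # (B, T, T)
        scores_prompt = torch.matmul(q, pk.transpose(-2, -1)) * scale  # (B, T, K)

        # Softmax each block independently, then gate the prompt block so the
        # adapter path is an exact no-op while the gate is still zero.
        attn_tokens = F.softmax(scores_tokens, dim=-1)
        attn_prompt = torch.tanh(self.gate) * F.softmax(scores_prompt, dim=-1)

        return torch.matmul(attn_tokens, v) + torch.matmul(attn_prompt, pv)


attn = ZeroInitGatedAttention(dim=64)
out = attn(torch.randn(2, 16, 64))  # -> (2, 16, 64)
```

With the gate at zero, the output matches plain attention over the input tokens, which is what keeps early fine-tuning stable.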
Efficiency comparison:

| Model | Parameters | Storage Space | Training Time |
|---|---|---|---|
| Alpaca | 7B | 13 GB | 3 hours |
| Alpaca-LoRA | - | 16.8 MB | - |
| LLaMA-Adapter | 1.2M | 4.7 MB | 1 hour |
The training code will be released here: https://github.com/ZrrSkywalker/LLaMA-Adapter
Hello, the paper is interesting and very insightful. We are already evaluating the feasibility of integrating it and will update this issue accordingly. PRs are welcome for adding this 🤗.
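If integrated, usage could presumably mirror peft's existing config + `get_peft_model` pattern. A hypothetical sketch under that assumption; `AdaptionPromptConfig`, `adapter_len`, and `adapter_layers` are placeholder names for illustration, not a confirmed peft API:

```python
# Hypothetical usage sketch, mirroring peft's LoraConfig + get_peft_model pattern.
# AdaptionPromptConfig, adapter_len, and adapter_layers are placeholder names,
# not an existing peft API at the time of this issue.
from transformers import AutoModelForCausalLM
from peft import get_peft_model
from peft import AdaptionPromptConfig  # assumed name, for illustration only

model = AutoModelForCausalLM.from_pretrained("path/to/llama-7b")  # placeholder checkpoint

config = AdaptionPromptConfig(
    adapter_len=10,     # K learnable prompt tokens per adapted layer
    adapter_layers=30,  # adapt the top 30 transformer layers
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # roughly the 1.2M trainable parameters reported in the paper
```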
Took a stab at it!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.