NeMo
Add LoRA support to all linear layers
What does this PR do?
Adds LoRA support to all linear transformations in the Attention and MLP modules of NeMo and MCore GPT models.
Collection: nlp
Usage
To add LoRA to specific layers, add the `target_modules` parameter to `lora_tuning`. The available configurations are below:
```yaml
lora_tuning:
  target_modules: ['attention_qkv'] # default, adds LoRA to the QKV projection layer
  target_modules: ['attention_qkv','attention_dense'] # adds LoRA to the QKV projection and Dense layers in attention
  target_modules: ['mlp_fc1','mlp_fc2'] # adds LoRA to the linear layers in the MLP module
  target_modules: ['attention_qkv','attention_dense','mlp_fc1','mlp_fc2'] # adds LoRA to every linear layer in the attention and MLP modules
```
There are also shortcut options, defined below:
```yaml
lora_tuning:
  target_modules: ['attention'] # same as ['attention_qkv','attention_dense']
  target_modules: ['mlp'] # same as ['mlp_fc1','mlp_fc2']
  target_modules: ['attention','mlp'] # same as ['attention_qkv','attention_dense','mlp_fc1','mlp_fc2']
  target_modules: ['all'] # same as ['attention_qkv','attention_dense','mlp_fc1','mlp_fc2']
```
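To illustrate the equivalences above, here is a small Python sketch of how the shortcut names could be expanded into the underlying module names. The mapping mirrors the table above, but the helper itself is hypothetical and not part of the NeMo API:

```python
# Hypothetical expansion of shortcut names to concrete module names;
# the mapping mirrors the equivalences documented above.
SHORTCUTS = {
    "attention": ["attention_qkv", "attention_dense"],
    "mlp": ["mlp_fc1", "mlp_fc2"],
    "all": ["attention_qkv", "attention_dense", "mlp_fc1", "mlp_fc2"],
}

def expand_target_modules(target_modules):
    """Expand shortcuts, preserving order and dropping duplicates."""
    expanded = []
    for name in target_modules:
        for module in SHORTCUTS.get(name, [name]):
            if module not in expanded:
                expanded.append(module)
    return expanded

print(expand_target_modules(["attention", "mlp"]))
# -> ['attention_qkv', 'attention_dense', 'mlp_fc1', 'mlp_fc2']
```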
If the `target_modules` parameter is not provided by the user, the default value is:
```yaml
target_modules: ['attention_qkv']
```
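For context on what "adding LoRA to a linear layer" means: LoRA keeps the pretrained weight `W` frozen and adds a trainable low-rank update, computing `y = W x + B A x` with `B` initialized to zero so training starts from the base model. A minimal NumPy sketch (illustrative only, not NeMo code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 8, 2

W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(size=(rank, d_in))   # trainable down-projection
B = np.zeros((d_out, rank))         # trainable up-projection, zero-init

def lora_linear(x):
    # Base projection plus low-rank update; because B starts at zero,
    # the layer is initially identical to the frozen linear layer.
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
assert np.allclose(lora_linear(x), W @ x)  # holds at initialization
```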
PR Type:
- [x] New Feature
- [ ] Bugfix
- [ ] Documentation
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
@arendu
Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs to various areas.
Additional Information
- Related to # (issue)
jenkins
This PR is stale because it has been open for 14 days with no activity. Remove the stale label, comment, or update the PR, or it will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
I think the CI issue is blocked by #8342.