Add LoRA support to all linear layers

Open ertkonuk opened this issue 1 year ago • 22 comments

What does this PR do ?

Add LoRA support to all linear transformations in Attention and MLP modules in NeMo and MCore GPT models.

Collection: nlp

Usage

To add LoRA to specific layers, add the target_modules parameter to lora_tuning. The available configurations are listed below:

lora_tuning:
      # Set exactly one target_modules entry; the lines below are alternatives.
      target_modules: ['attention_qkv'] # default, adds LoRA to the QKV projection layer
      target_modules: ['attention_qkv','attention_dense'] # adds LoRA to the QKV projection and dense layers in attention
      target_modules: ['mlp_fc1','mlp_fc2'] # adds LoRA to the linear layers in the MLP module
      target_modules: ['attention_qkv','attention_dense','mlp_fc1','mlp_fc2'] # adds LoRA to every linear layer in the attention and MLP modules

There are also shortcut options, defined as follows:

lora_tuning:
      target_modules: ['attention'] # same as ['attention_qkv','attention_dense']
      target_modules: ['mlp'] # same as ['mlp_fc1','mlp_fc2']
      target_modules: ['attention','mlp'] # same as ['attention_qkv','attention_dense','mlp_fc1','mlp_fc2']
      target_modules: ['all'] # same as ['attention_qkv','attention_dense','mlp_fc1','mlp_fc2']

If the user does not provide the target_modules parameter, the default value is:

 target_modules: ['attention_qkv']
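
The shortcut and default rules above can be sketched in a few lines of Python. This is only an illustration of the mapping described in this PR, not the code the PR adds to NeMo: the function name expand_target_modules and the constants are hypothetical, and the choice to raise an error on unknown names is an assumption.

```python
# Hypothetical sketch of the target_modules resolution described above.
# None of these names are taken from the NeMo code base.

# Canonical layer names, in the order used in the PR description.
CANONICAL_MODULES = ("attention_qkv", "attention_dense", "mlp_fc1", "mlp_fc2")

# Shortcut names and the canonical names they stand for.
SHORTCUTS = {
    "attention": ("attention_qkv", "attention_dense"),
    "mlp": ("mlp_fc1", "mlp_fc2"),
    "all": CANONICAL_MODULES,
}

# Value used when target_modules is not provided.
DEFAULT_MODULES = ("attention_qkv",)


def expand_target_modules(target_modules=None):
    """Return the canonical layer names selected by target_modules."""
    if target_modules is None:
        return list(DEFAULT_MODULES)

    selected = set()
    for name in target_modules:
        if name in SHORTCUTS:
            selected.update(SHORTCUTS[name])
        elif name in CANONICAL_MODULES:
            selected.add(name)
        else:
            # Assumption: unknown names are rejected rather than ignored.
            raise ValueError(f"Unknown target module: {name!r}")

    # Deduplicate and return in canonical order.
    return [m for m in CANONICAL_MODULES if m in selected]


if __name__ == "__main__":
    print(expand_target_modules(["attention", "mlp_fc1"]))
    # ['attention_qkv', 'attention_dense', 'mlp_fc1']
```

Mixing shortcuts with explicit names works because the result is deduplicated and returned in canonical order.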

PR Type:

  • [x] New Feature
  • [ ] Bugfix
  • [ ] Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

@arendu

Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

ertkonuk, Dec 07 '23 00:12

jenkins

ericharper, Dec 11 '23 20:12

jenkins

arendu, Dec 21 '23 01:12

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot], Jan 04 '24 01:01

This PR was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot], Jan 11 '24 01:01

jenkins

arendu, Jan 16 '24 19:01

jenkins

arendu, Jan 31 '24 18:01

jenkins

arendu, Feb 05 '24 21:02

jenkins

arendu, Feb 05 '24 23:02

jenkins

arendu, Feb 06 '24 03:02

jenkins

arendu, Feb 06 '24 08:02

jenkins

arendu, Feb 06 '24 16:02

jenkins

HeyyyyyyG, Feb 07 '24 02:02

jenkins

cuichenx, Feb 07 '24 21:02

jenkins

HeyyyyyyG, Feb 08 '24 05:02

jenkins

cuichenx, Feb 09 '24 01:02

jenkins

HeyyyyyyG, Feb 09 '24 17:02

I think the CI issue is blocked by #8342

cuichenx, Feb 12 '24 21:02

jenkins

cuichenx, Feb 12 '24 22:02

jenkins

cuichenx, Feb 13 '24 17:02

jenkins

arendu, Feb 14 '24 19:02

jenkins

cuichenx, Feb 14 '24 22:02

jenkins

cuichenx, Feb 15 '24 21:02

jenkins

cuichenx, Feb 20 '24 15:02

jenkins

cuichenx, Feb 20 '24 19:02

jenkins

HeyyyyyyG, Feb 21 '24 00:02