
Skip LoRA layers with zero scale

Open bonlime opened this issue 8 months ago • 2 comments

Hi, I know this fix may seem a little random, and it would probably make sense to add similar lines to all other adapters, but I don't have time to do that. I noticed that currently, even if you have many adapters disabled (with zero weights), we still do a matmul for them, which is useless. On my local machine this change gives an extra 2-3% speedup for each zero-scaled LoRA. When many of them are loaded, the effect is noticeable.
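The proposed change can be sketched as follows. This is a minimal, hypothetical illustration of the check being suggested, not the actual peft code: `lora_forward` and its arguments are made up for this example, and NumPy stands in for the real tensor operations.

```python
import numpy as np

def lora_forward(x, base_w, adapters):
    """Simplified LoRA-style forward pass.

    adapters: list of (lora_A, lora_B, scaling) tuples.
    """
    result = x @ base_w
    for lora_A, lora_B, scaling in adapters:
        if scaling == 0:
            # The proposed early exit: a zero-scaled adapter contributes
            # exactly zero, so both low-rank matmuls can be skipped.
            continue
        result = result + (x @ lora_A @ lora_B) * scaling
    return result
```

With many zero-scaled adapters loaded, each skipped iteration saves two matrix multiplications, which is where the reported per-adapter speedup would come from.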

bonlime avatar Mar 27 '25 15:03 bonlime

Thanks for the PR. Could you please give us a bit more context on why you have scalings with 0? Note that there is already a check for disabled adapters, like here:

https://github.com/huggingface/peft/blob/7279a9ff2e661be90fe6af76e5653f8edb7ecf01/src/peft/tuners/lora/layer.py#L718-L733

So if your goal is to disable a whole adapter, that part of the code should never be reached. If you just have a handful of layers with scale 0, while the rest of the adapter is still active as normal on the other layers, I would instead suggest defining target_modules such that those 0-scale adapters are skipped, i.e. they don't have LoRA layers at all.
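As a sketch of that suggestion (the module names below are illustrative, not taken from any particular model), restricting `target_modules` in the `LoraConfig` means LoRA layers are only ever created where they are wanted, so no matmul runs for the others:

```python
from peft import LoraConfig

# Illustrative example: only attach LoRA layers to the modules that
# should actually be adapted. Modules you would have given scale 0 are
# simply not listed, so they never get LoRA layers at all.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # hypothetical module names
)
```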

The only use case where I could see your optimization applying is if I want to dynamically deactivate a couple of layers of my LoRA adapter.

We have to be mindful that even if these kinds of optimizations look harmless, adding this value-dependent control flow can create issues with torch.compile, so we shouldn't add them without good testing.

BenjaminBossan avatar Mar 27 '25 15:03 BenjaminBossan

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] avatar Apr 26 '25 15:04 github-actions[bot]