Skip LoRA layers with zero scale
Hi, I know this fix may seem a little random, and it would probably make sense to add similar lines to all the other adapter types; I just don't have time to do that. But I noticed that we currently run the matmul for an adapter even when it is effectively disabled (its scaling is zero), which is wasted work. On my local machine, skipping these gives an extra 2-3% speedup for each zero-scaled LoRA, which has a noticeable effect when many adapters are loaded.
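For concreteness, here is a minimal single-adapter sketch of the idea. It is not the actual diff: the real peft layer stores `lora_A`, `lora_B`, and `scaling` per adapter name and loops over `active_adapters`, so there the fix is a `continue` inside that loop rather than an early return. `LoraLinearSketch` below is a hypothetical stand-in used only to illustrate the skip.

```python
# Minimal sketch (not peft's actual class): skip the low-rank matmuls
# entirely when the adapter's scaling factor is zero, since a
# zero-scaled adapter contributes nothing to the output.
import torch
import torch.nn as nn


class LoraLinearSketch(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, scaling: float = 1.0):
        super().__init__()
        self.base = base
        self.lora_A = nn.Linear(base.in_features, rank, bias=False)
        self.lora_B = nn.Linear(rank, base.out_features, bias=False)
        self.scaling = scaling

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        result = self.base(x)
        # The proposed fix: if the scale is zero, the LoRA branch would be
        # multiplied away anyway, so don't pay for its two matmuls.
        if self.scaling == 0:
            return result
        return result + self.lora_B(self.lora_A(x)) * self.scaling
```

In the multi-adapter loop this same check just becomes skipping to the next adapter, which is where the per-adapter 2-3% saving comes from.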