flash-attention
Is there a way to use flash attention and selectively finetune only the q projection layer, leaving the k and v projection layers frozen?
You can just change the Python interface (https://github.com/Dao-AILab/flash-attention/blob/main/flash_attn/flash_attn_interface.py) to set k_grad and v_grad to None and see if that works.
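If the goal is only to keep the k/v projection weights frozen (rather than to skip computing their gradients inside the kernel), a minimal sketch of the standard PyTorch alternative is shown below: set `requires_grad=False` on those projections and let autograd discard the unused gradients. The `SelfAttention` module and the `q_proj`/`k_proj`/`v_proj` names are illustrative, not taken from the flash-attention codebase; only `flash_attn_func` is the library's actual API.

```python
# Sketch: freeze k/v projections at the module level instead of editing
# flash_attn_interface.py. Module/layer names here are illustrative.
import torch
import torch.nn as nn
from flash_attn import flash_attn_func


class SelfAttention(nn.Module):
    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, s, d = x.shape
        h = self.n_heads
        # flash_attn_func expects (batch, seqlen, nheads, headdim) fp16/bf16 tensors
        q = self.q_proj(x).view(b, s, h, d // h)
        k = self.k_proj(x).view(b, s, h, d // h)
        v = self.v_proj(x).view(b, s, h, d // h)
        out = flash_attn_func(q, k, v, causal=True)
        return out.reshape(b, s, d)


attn = SelfAttention(dim=64, n_heads=4).cuda().half()

# Freeze the k/v projections; only q_proj stays trainable.
for p in attn.k_proj.parameters():
    p.requires_grad_(False)
for p in attn.v_proj.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.AdamW(
    (p for p in attn.parameters() if p.requires_grad), lr=1e-4
)

x = torch.randn(2, 128, 64, device="cuda", dtype=torch.float16)
out = attn(x)
out.sum().backward()  # dk/dv are still computed by the kernel but never applied
optimizer.step()      # updates q_proj only
```

Note this does not save the backward-pass compute for dk/dv, which is what the suggested change to the interface would address; it only prevents the frozen weights from being updated.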