
Hello from HF Diffusers

Open sayakpaul opened this issue 1 year ago • 3 comments

Thanks for the incredibly clean repository!

I am Sayak from the Diffusers team at Hugging Face. My question is probably very naive, so I apologize for that in advance.

I wanted to know if linear attention could be applied at inference time only. More precisely, can I take a model trained with regular (softmax) attention and turn it into a linear attention model during inference?
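
To make the question concrete, here is a rough sketch of the two formulations I have in mind (purely illustrative, not code from this repo; the feature map, shapes, and function names are my own assumptions):

```python
# Illustrative comparison: softmax attention vs. a kernelized linear-attention
# readout over the same q/k/v. Names and the feature map are hypothetical.
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # Standard attention: softmax(Q K^T / sqrt(d)) V
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v

def linear_attention(q, k, v, feature_map=lambda x: F.elu(x) + 1):
    # Kernelized attention: phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1)
    q, k = feature_map(q), feature_map(k)
    kv = k.transpose(-2, -1) @ v                            # (d, d_v) summary
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)   # normalizer
    return (q @ kv) / (z + 1e-6)

q = torch.randn(1, 128, 64)  # (batch, seq_len, head_dim)
k = torch.randn(1, 128, 64)
v = torch.randn(1, 128, 64)

out_softmax = softmax_attention(q, k, v)
out_linear = linear_attention(q, k, v)
# The two outputs generally differ, which is the crux of the question:
# can projections trained under the first form be reused with the second?
print((out_softmax - out_linear).abs().mean())
```

In other words, could the trained q/k/v projections simply be reused with the second form, or does this require some adaptation or fine-tuning?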

sayakpaul · Aug 15 '24