flash-linear-attention
Hello from HF Diffusers
Thanks for the incredibly clean repository!
I am Sayak from the Diffusers team at Hugging Face. My question is probably very naive, so I apologize for that in advance.
I wanted to know if linear attention can be applied at inference time only. More precisely, can I take a model trained with regular (softmax) attention and turn it into a linear attention model during inference?
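To make the question concrete, here is a toy NumPy sketch of the swap I have in mind: both functions consume the same pretrained `q`, `k`, `v` projections, but the second replaces the softmax with a kernel feature map `phi` (here a ReLU-based map, chosen arbitrarily for illustration; it is not the feature map this repo uses). The question is whether such a drop-in replacement can work without retraining.

```python
import numpy as np

def softmax_attention(q, k, v):
    """Standard attention: softmax(Q K^T / sqrt(d)) V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def linear_attention(q, k, v, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Linear attention: phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1).

    phi is a placeholder feature map; the key point is that
    phi(K)^T V is computed once, so cost is linear in sequence length.
    """
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                 # (d, d_v) summary of keys and values
    z = qf @ kf.sum(axis=0)       # per-query normalizer
    return (qf @ kv) / z[:, None]
```

In this sketch the two functions generally produce different outputs for the same weights, which is exactly why I am asking whether a train-free conversion is feasible or whether some adaptation is required.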