BS-RoFormer
Linear Attention
Hello, I have two questions about the Linear Attention that was added later. Can you clarify why it is called Linear Attention when the referenced paper introduces Cross-Covariance Attention, and why exactly is it better than Self-Attention?
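For reference, here is a minimal NumPy sketch of cross-covariance (channel-wise) attention in the spirit of the XCiT paper, not the repo's actual implementation. Single head, no temperature parameter; `cross_covariance_attention` and the weight arguments are names made up for this illustration. The attention map is `(dim, dim)` over feature channels rather than `(n, n)` over tokens, which is presumably why its cost is described as linear in sequence length:

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_covariance_attention(x, wq, wk, wv):
    """x: (n_tokens, dim); wq/wk/wv: (dim, dim) projection weights.
    The attention map is (dim, dim), not (n_tokens, n_tokens), so the
    cost grows linearly with the number of tokens."""
    q, k, v = x @ wq, x @ wk, x @ wv
    # L2-normalize each channel column over the token axis, as in XCiT,
    # so the channel attention map stays well-scaled for any n_tokens
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    k = k / (np.linalg.norm(k, axis=0, keepdims=True) + 1e-8)
    attn = softmax(q.T @ k, axis=-1)  # (dim, dim) channel attention
    return v @ attn.T                 # (n_tokens, dim) output
```

In ordinary self-attention the `softmax(q @ k.T)` map is `(n_tokens, n_tokens)`, so compute and memory grow quadratically with sequence length; here the quadratic term is in the (fixed) channel dimension instead.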