Liger-Kernel
Grouped Latent Attention
🚀 The feature, motivation and pitch
Grouped Latent Attention (GLA) is new work from Prof. Dao's lab that improves on DeepSeek's original Multi-head Latent Attention (MLA).
Relevant Paper: https://arxiv.org/pdf/2505.21487
Alternatives
No response
Additional context
No response