Liger-Kernel
DeepSeek Native Sparse Attention (NSA) Kernel
🚀 The feature, motivation and pitch
Add a kernel for Native Sparse Attention (NSA), as described in the paper "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention": https://arxiv.org/abs/2502.11089
A potentially useful Python reference: https://github.com/dhcode-cpp/NSA-pytorch
Alternatives
No response
Additional context
No response