
[Operator] Fused Neighborhood Attention

Open AndreSlavescu opened this issue 6 months ago • 2 comments

Summary

https://github.com/linkedin/Liger-Kernel/issues/733

Testing Done

Tested Attention Layer and Attention module implementation for FusedNeighborhoodAttention

  • Hardware Type: 3090 & H100 SXM5
  • [x] run make test to ensure correctness
  • [x] run make checkstyle to ensure code style
  • [x] run make test-convergence to ensure convergence
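For readers new to the operator, here is a rough sketch of the 1D neighborhood pattern that neighborhood attention is built on: each query attends to a fixed-size window of keys around its own position, and the window is shifted inward at the sequence borders so it keeps the same size. The function names are hypothetical and are not taken from this PR. Whether the fused kernel handles borders exactly this way is an assumption; dilation is not modeled.

```python
def neighborhood_indices(i, seq_len, kernel_size):
    """Key positions that query i attends to (hypothetical helper, 1D, no dilation)."""
    if kernel_size > seq_len:
        raise ValueError("kernel_size must not exceed seq_len")
    half = kernel_size // 2
    # Center the window on i, then clamp so it stays inside [0, seq_len).
    start = min(max(i - half, 0), seq_len - kernel_size)
    return list(range(start, start + kernel_size))


def neighborhood_mask(seq_len, kernel_size):
    """Boolean mask[i][j], True where query i may attend to key j."""
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in neighborhood_indices(i, seq_len, kernel_size):
            mask[i][j] = True
    return mask
```

A dense reference attention with this mask applied is one way to cross-check a fused kernel's output, since every row of the mask has exactly `kernel_size` allowed positions.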

AndreSlavescu avatar May 27 '25 18:05 AndreSlavescu

@Tcc0403 @lancerts @qingquansong @shivam15s

Eventual goal is to compare sparsity performance with sparse MTA.

I will try to get H100 benchmark numbers tonight.

AndreSlavescu avatar May 27 '25 18:05 AndreSlavescu

RTX 3090 numbers:

Fwd:

image

Bwd:

image

Memory:

image

H100 SXM5 numbers:

Fwd:

image

Bwd:

image

Memory:

image

AndreSlavescu avatar May 27 '25 19:05 AndreSlavescu

@AndreSlavescu Great work! Do you happen to have a sense of how the triton kernel impl compares to the reported numbers for their cutlass kernel implementation?

shimizust avatar Jun 05 '25 18:06 shimizust

> @AndreSlavescu Great work! Do you happen to have a sense of how the triton kernel impl compares to the reported numbers for their cutlass kernel implementation?

I can do a more insightful benchmark in terms of FLOPs achieved and plot arithmetic intensity to compare. They report numbers relative to naive NA, but I will need to review the paper for exact FLOPs details.
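As a rough sketch of what such a measurement could look like (this is not the PR's benchmark code, and the helper names are hypothetical): the FLOP count below assumes each query attends to exactly `kernel_size` keys and counts only the two matmuls, ignoring softmax. The byte count assumes the ideal case where Q, K, V and the output each cross memory once. The paper's own FLOP accounting may differ.

```python
def na_forward_flops(batch, heads, seq_len, head_dim, kernel_size):
    """Approximate forward FLOPs for 1D neighborhood attention (assumed model)."""
    # Q.K^T costs 2 * head_dim FLOPs per (query, key) pair; P.V costs the same.
    flops_per_pair = 4 * head_dim
    return batch * heads * seq_len * kernel_size * flops_per_pair


def min_bytes_moved(batch, heads, seq_len, head_dim, bytes_per_elem=2):
    """Lower bound on traffic: read Q, K, V and write O once each (assumption)."""
    return 4 * batch * heads * seq_len * head_dim * bytes_per_elem


def arithmetic_intensity(flops, bytes_moved):
    """FLOPs per byte, the x-axis of a roofline plot."""
    return flops / bytes_moved


def achieved_tflops(flops, elapsed_ms):
    """Throughput implied by a measured kernel time in milliseconds."""
    return flops / (elapsed_ms * 1e-3) / 1e12


flops = na_forward_flops(batch=1, heads=8, seq_len=1024, head_dim=64, kernel_size=7)
traffic = min_bytes_moved(batch=1, heads=8, seq_len=1024, head_dim=64)
print(arithmetic_intensity(flops, traffic))  # 3.5 FLOPs/byte under these assumptions
```

Under this model the intensity for 2-byte elements reduces to `kernel_size / 2`, so small neighborhoods would sit well inside the memory-bound region of a roofline plot.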

AndreSlavescu avatar Jun 05 '25 21:06 AndreSlavescu

> > @AndreSlavescu Great work! Do you happen to have a sense of how the triton kernel impl compares to the reported numbers for their cutlass kernel implementation?
>
> I can do a more insightful benchmark in terms of FLOPs achieved and plot arithmetic intensity to compare. They report numbers relative to naive NA, but I will need to review the paper for exact FLOPs details.

Yeah, that would be great to see and to put in the PR description/release notes.

shimizust avatar Jun 05 '25 22:06 shimizust