2:4 Sparsity acceleration does not deliver any benefit.

Open Moritz-Tho123 opened this issue 9 months ago • 0 comments

The conclusion of the tutorial for 2:4 sparsity here claims a 1.3x-2.0x advantage for 2:4 sparsity over dense execution. However, the actual values shown in the terminal output of the tutorial's dense and sparse sections give the following table:

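| bs  | compile | Dense  | Sparse  | Speedup |
|-----|---------|--------|---------|---------|
| 4   | n       | 9.56   | 16.77   | 0.57x   |
| 4   | y       | 8.98   | 9.49    | 0.95x   |
| 16  | n       | 31.86  | 62.27   | 0.51x   |
| 16  | y       | 30.83  | 34.29   | 0.90x   |
| 64  | n       | 123.97 | 243.16  | 0.51x   |
| 64  | y       | 104.98 | 133.49  | 0.79x   |
| 256 | n       | 476.03 | 1195.23 | 0.40x   |
| 256 | y       | 397.13 | 542.3   | 0.73x   |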
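
For reference, the Speedup column is the dense time divided by the sparse time. Below is a minimal sketch that recomputes it from the numbers in the table, assuming both columns are timings in the same unit where lower is better. The names `rows` and `speedup` are only illustrative and are not taken from the tutorial.

```python
# (batch size, compiled?, dense time, sparse time), copied from the table above.
# Assumption: both timing columns use the same unit and lower is better.
rows = [
    (4, False, 9.56, 16.77),
    (4, True, 8.98, 9.49),
    (16, False, 31.86, 62.27),
    (16, True, 30.83, 34.29),
    (64, False, 123.97, 243.16),
    (64, True, 104.98, 133.49),
    (256, False, 476.03, 1195.23),
    (256, True, 397.13, 542.3),
]


def speedup(dense: float, sparse: float) -> float:
    """Dense time over sparse time; a value above 1.0 means sparse is faster."""
    return dense / sparse


speedups = [speedup(dense, sparse) for _, _, dense, sparse in rows]

for (bs, compiled, _, _), ratio in zip(rows, speedups):
    flag = "y" if compiled else "n"
    print(f"bs={bs:<4} compile={flag} speedup={ratio:.2f}x")

# Check whether the sparse path wins in at least one configuration.
print("sparse faster in any row:", any(ratio > 1.0 for ratio in speedups))
```

A ratio below 1.0 means the sparse run took longer than the dense run for that configuration.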

As can be seen, the sparse matrix computation does not beat the dense one even once. I reran these experiments with torch 2.5.1+cu2.4 on a single H100 and observed similar results.

Why are the measured values so much worse than the claimed 1.3x-2.0x speedup?

Moritz-Tho123, Jan 20 '25 16:01