GOATnote
Performance:
- 52.1 TFLOPS on NVIDIA L4 (Ada, SM 8.9)
- 1.74× faster than CUTLASS 4.3.0 baseline (~30 TFLOPS)
- 63× faster than cuSPARSE (0.87 TFLOPS)
- 83% efficiency vs...
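
The speedup factors can be cross-checked against the throughput figures quoted above. The following is a minimal sketch that assumes each speedup is simply the ratio of the two TFLOPS values; the variable names and the `speedup` helper are illustrative and not taken from the project.

```python
# Throughput figures as quoted above, in TFLOPS.
kernel_tflops = 52.1
cutlass_tflops = 30.0   # quoted as "~30", so this input is approximate
cusparse_tflops = 0.87


def speedup(fast: float, slow: float) -> float:
    """Return how many times higher `fast` throughput is than `slow`."""
    return fast / slow


vs_cutlass = speedup(kernel_tflops, cutlass_tflops)
vs_cusparse = speedup(kernel_tflops, cusparse_tflops)

print(f"vs CUTLASS:  {vs_cutlass:.2f}x")
print(f"vs cuSPARSE: {vs_cusparse:.1f}x")
```

With these rounded inputs the CUTLASS ratio reproduces the quoted 1.74×, while the cuSPARSE ratio comes out near 60× rather than 63×. The quoted figure may have been computed from unrounded measurements or from a different run.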