
Add optional flag_gems support

Open xuzhao9 opened this issue 1 year ago • 0 comments

Import the optional FlagGems Triton kernels (https://github.com/FlagOpen/FlagGems) and add benchmark support for the softmax and addmm operators. FlagGems is an optional dependency.
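A minimal sketch of the optional-import pattern described above, not the code from this change. The helper names `optional_import` and `available_backends` are hypothetical; only the package name `flag_gems` and the backend labels `flaggems` and `triton_addmm` come from this issue.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too: the dependency is simply absent.
        return None


# flag_gems is treated as optional: the benchmark should still load without it.
flag_gems = optional_import("flag_gems")
HAS_FLAG_GEMS = flag_gems is not None


def available_backends():
    """Hypothetical helper: list the backend labels that can be benchmarked."""
    backends = ["triton_addmm"]
    if HAS_FLAG_GEMS:
        backends.append("flaggems")
    return backends


if __name__ == "__main__":
    print(available_backends())
```

With this shape, a run that requests `--only flaggems` on a machine without FlagGems could skip that backend or report it as unavailable rather than fail at import time.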

Test plan:

$ python run_benchmark.py triton --op addmm --only flaggems,triton_addmm --num-inputs 2 --metrics latency,gbps,tflops
         (M, N, K)    flaggems-gbps    flaggems-latency    flaggems-tflops    triton_addmm-gbps    triton_addmm-latency    triton_addmm-tflops
------------------  ---------------  ------------------  -----------------  -------------------  ----------------------  ---------------------
(20120, 512, 1536)          220.794            0.473686            66.808               234.791                0.445449                71.043
(34579, 512, 1536)          224.696            0.79493             68.4186              231.003                0.773224                70.3393
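
As a sanity check on the table, the tflops column can be reproduced from the latency column. The sketch below assumes latency is reported in milliseconds and that addmm is counted as 2·M·N·K floating-point operations; both are assumptions inferred from the numbers, not stated in the issue. The function name `addmm_tflops` is hypothetical.

```python
def addmm_tflops(m, n, k, latency_ms):
    """Estimate TFLOPS for an (M, K) x (K, N) addmm from latency in ms.

    Assumes 2*M*N*K floating-point operations (one multiply and one add
    per inner-product term); the bias add is ignored.
    """
    flops = 2 * m * n * k
    seconds = latency_ms * 1e-3
    return flops / seconds / 1e12


if __name__ == "__main__":
    # Rows taken from the flaggems columns of the table above.
    for shape, latency in [((20120, 512, 1536), 0.473686),
                           ((34579, 512, 1536), 0.79493)]:
        print(shape, round(addmm_tflops(*shape, latency), 3))
```

Under these assumptions the computed values agree with the reported flaggems-tflops figures (66.808 and 68.4186) to within rounding.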

xuzhao9, Oct 06 '24 00:10