
[QUESTION] Is it possible to use splitK kernel in AG mode to overlap comm and gemm?

Open conghaobian opened this issue 7 months ago • 1 comment

  • I am using flux v1.0.4 to overlap GEMM and communication. In AG mode, I see that flux uses a streamK kernel based on CUTLASS.
  • But on my GPU, splitK kernels perform better, so I want to use splitK kernels instead of streamK kernels.

So can you tell me whether it is possible to overlap communication and GEMM with splitK kernels?

conghaobian · May 14 '25 12:05

Yes, it's possible in theory, but not implemented.

I'm not aware of cases where split-k is faster than stream-k. Can you provide some cases where it is?

houqi · May 29 '25 03:05
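
For readers unfamiliar with the terms in this thread, here is a minimal CPU-only sketch of what combining split-K with AG-style overlap could look like. It is not flux code and uses no flux or CUTLASS API; all function names (`matmul_ref`, `split_k_partials`, `reduce_partials`, `ag_split_k_gemm`) are hypothetical. It assumes the all-gathered operand is sharded along M, so each shard can be multiplied as soon as it arrives, and each shard's GEMM is split along K into partial products that are summed afterwards.

```python
def matmul_ref(a, b):
    """Plain GEMM over nested lists, used as the reference result."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def split_k_partials(a, b, splits):
    """Return one partial M x N product per K slice (the split-K step)."""
    m, k, n = len(a), len(b), len(b[0])
    step = -(-k // splits)  # ceil(k / splits)
    partials = []
    for start in range(0, k, step):
        stop = min(start + step, k)
        partials.append(
            [[sum(a[i][p] * b[p][j] for p in range(start, stop))
              for j in range(n)] for i in range(m)]
        )
    return partials


def reduce_partials(partials):
    """Sum the partial products elementwise (the split-K reduction step)."""
    m, n = len(partials[0]), len(partials[0][0])
    return [[sum(part[i][j] for part in partials) for j in range(n)]
            for i in range(m)]


def ag_split_k_gemm(shards, b, splits, arrival_order):
    """Multiply M-shards in the order they 'arrive', each with split-K.

    Assumption: shards[rank] holds the rows of A owned by that rank, and
    arrival_order stands in for the readiness signals of an all-gather.
    """
    out = {}
    for rank in arrival_order:
        out[rank] = reduce_partials(split_k_partials(shards[rank], b, splits))
    # Reassemble the output rows in rank order, independent of arrival order.
    return [row for rank in sorted(out) for row in out[rank]]


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
B = [[1, 0], [0, 1], [2, 3], [4, 5]]
shards = [A[0:1], A[1:2], A[2:3]]  # one row of A per hypothetical rank
result = ag_split_k_gemm(shards, B, splits=2, arrival_order=[2, 0, 1])
```

The example uses integers so the split-K result matches the reference exactly; with floating-point inputs the changed summation order can produce small differences. On a GPU the extra reduction pass is also a real cost, which this sketch does not model.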