[Feature Request] Grouped GEMM kernel

Open • LiyuanLucasLiu opened this issue 1 year ago • 1 comment

Thanks for the awesome library! Are there plans to add op support for grouped GEMM, as in https://github.com/tgale96/grouped_gemm/tree/main ?

For additional context, FP8 appears to be supported in the CUTLASS grouped GEMM example:

https://github.com/NVIDIA/cutlass/blob/main/examples/57_hopper_grouped_gemm/57_hopper_grouped_gemm.cu#L94

LiyuanLucasLiu • Feb 24 '24 21:02

A GroupedLinear layer has been added in TE v1.9, and it has FP8 support.

yaox12 • Sep 05 '24 03:09
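
For readers new to the term: a grouped GEMM runs several independent matrix multiplications in one call, with each group using its own weight matrix on its own slice of the input rows. Below is a minimal pure-Python reference sketch of those semantics only. It is not how TE or CUTLASS implement it, and the names `matmul`, `grouped_gemm_reference`, and the argument layout are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain (m x k) @ (k x n) product on nested lists."""
    k, n = len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def grouped_gemm_reference(x, weights, m_splits):
    """Hypothetical reference: apply weights[i] to the i-th row slice of x.

    x        : list of rows, shape (sum(m_splits), k)
    weights  : one (k x n) matrix per group
    m_splits : number of rows of x that belong to each group
    """
    if len(weights) != len(m_splits):
        raise ValueError("need exactly one weight matrix per group")
    if sum(m_splits) != len(x):
        raise ValueError("m_splits must sum to the number of rows in x")

    out, start = [], 0
    for w, m in zip(weights, m_splits):
        # Each group is an independent GEMM over its own slice of rows.
        out.extend(matmul(x[start:start + m], w))
        start += m
    return out


x = [[1, 2], [3, 4], [5, 6]]
weights = [
    [[1, 0], [0, 1]],  # group 0: identity
    [[0, 1], [1, 0]],  # group 1: swaps the two columns
]
result = grouped_gemm_reference(x, weights, m_splits=[1, 2])
print(result)
```

Here the first row goes through the identity weight and the remaining two rows go through the column-swapping weight. A fused kernel computes the same result without launching one GEMM per group, which is the point of the feature request.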