Liger-Kernel

gemm fp8 e4m3

Open AndreSlavescu opened this issue 5 months ago • 3 comments

Summary

Implemented an FP8 GEMM kernel that uses the E4M3 representation (4 exponent bits, 3 mantissa bits).
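For readers unfamiliar with the format, here is a minimal sketch of how a single E4M3 byte maps to a real value. It assumes the "E4M3FN" variant (exponent bias 7, no infinities, a single NaN mantissa pattern, largest finite value 448), which is the layout behind PyTorch's `torch.float8_e4m3fn`; the PR does not state which variant the kernel uses, and `decode_e4m3` is a hypothetical helper written only for illustration.

```python
def decode_e4m3(byte: int) -> float:
    """Decode one 8-bit E4M3 pattern (assumed E4M3FN variant) to a Python float."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF  # 4 exponent bits, bias 7
    man = byte & 0x7         # 3 mantissa bits
    if exp == 0xF and man == 0x7:
        # The only NaN encoding; this variant has no infinities.
        return float("nan")
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * (man / 8.0) * 2.0 ** -6
    # Normal: implicit leading 1.
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


if __name__ == "__main__":
    for pattern in (0x38, 0x7E, 0x01):
        print(hex(pattern), decode_e4m3(pattern))
```

With only 3 mantissa bits, neighbouring values within a binade differ by 1/8 of the leading power of two, which is why comparisons against a higher-precision reference need a loose tolerance.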

Issue #65

Testing Done

Tested square matrices of varying sizes (64, 256, 512, 1024, 2048) as well as non-square matrices of varying sizes, and compared the results against torch.matmul. The backward comparison casts the inputs appropriately, because torch.matmul doesn't support the fp8_e4m3 dtype for backward.
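The comparison idea can be sketched without a GPU. The snippet below is not the PR's test code: it is a pure-Python illustration that rounds the inputs onto the E4M3 grid (assuming the E4M3FN variant with saturation at 448), multiplies them, and measures the error against a full-precision reference. The helper names, the tiny matrix sizes, and the 0.25 tolerance are all assumptions chosen for illustration.

```python
import math
import random

E4M3_MAX = 448.0  # largest finite value, assuming the E4M3FN variant


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), E4M3_MAX)
    # frexp gives a = m * 2**e with 0.5 <= m < 1, so floor(log2(a)) == e - 1.
    # Clamp to -6 (minimum normal exponent) so subnormals share one spacing.
    exponent = max(math.frexp(a)[1] - 1, -6)
    step = 2.0 ** (exponent - 3)  # 3 mantissa bits: 8 steps per binade
    return sign * min(round(a / step) * step, E4M3_MAX)


def matmul(a, b):
    """Plain triple-loop matrix product on lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def max_scaled_error(m, k, n, seed=0):
    """Max abs error of the quantized product, scaled by the largest reference entry."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(m)]
    b = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(k)]
    ref = matmul(a, b)
    out = matmul([[round_to_e4m3(v) for v in row] for row in a],
                 [[round_to_e4m3(v) for v in row] for row in b])
    scale = max(abs(v) for row in ref for v in row)
    worst = max(abs(o - r) for ro, rr in zip(out, ref) for o, r in zip(ro, rr))
    return worst / scale


if __name__ == "__main__":
    # Square and non-square shapes, mirroring the structure of the tests above.
    for shape in ((8, 8, 8), (8, 16, 4)):
        assert max_scaled_error(*shape) < 0.25  # hypothetical tolerance
```

In a real test the reference would be torch.matmul on higher-precision tensors and the sizes would match those listed above; the point here is only that FP8 results are checked against a tolerance, not for exact equality.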

The FP8 GEMM only works on GPUs with compute capability SM_89 or newer.
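A guard for this requirement could look like the sketch below. `supports_fp8_gemm` is a hypothetical helper, not part of the PR; it takes a `(major, minor)` compute-capability tuple such as the one returned by `torch.cuda.get_device_capability()`.

```python
from typing import Tuple


def supports_fp8_gemm(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is SM_89 or newer."""
    # Tuples compare lexicographically, so (9, 0) >= (8, 9) but (8, 6) is not.
    return tuple(capability) >= (8, 9)


if __name__ == "__main__":
    # RTX 4090 reports (8, 9); A100 reports (8, 0); H100 reports (9, 0).
    for name, cap in (("RTX 4090", (8, 9)), ("A100", (8, 0)), ("H100", (9, 0))):
        print(name, supports_fp8_gemm(cap))
```

A test suite could use such a check to skip the FP8 tests on older hardware instead of failing.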

  • Hardware Type: RTX 4090
  • [x] run make test to ensure correctness
  • [x] run make checkstyle to ensure code style
  • [x] run make test-convergence to ensure convergence

AndreSlavescu, Aug 31 '24 08:08