Paddle
Support float8 gemm_fused with CUTLASS
PR Category
Performance Optimization
PR Types
New features
Description
pcard-71500
- Add a CUTLASS kernel for fp8_fp8_half_gemm_fused (fuses GEMM + bias + scale + activation)
- Add the API and phi kernel for fp8_fp8_fp8_dual_gemm_fused (fuses GEMM + GEMM + SwiGLU)
- Add a CUTLASS kernel for fp8_fp8_fp8_dual_gemm_fused
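
For reviewers, here is a rough pure-Python reference for the math the two fused ops are expected to compute. It is only a sketch under assumptions, not the kernel code in this PR: the function names `gemm_bias_scale_act_ref` and `dual_gemm_swiglu_ref` are hypothetical, the epilogue order (scale, then bias, then activation) is assumed, the SwiGLU form `silu(x @ w0) * (x @ w1)` is assumed, and FP8 quantization and the half/FP8 output casts are left out entirely.

```python
import math


def matmul(a, b):
    """Plain row-major matrix multiply on nested lists."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def gemm_bias_scale_act_ref(x, y, bias, scale=1.0, act="identity"):
    """Hypothetical reference for gemm + bias + scale + act.

    Assumed order: out = act(scale * (x @ y) + bias), bias broadcast per column.
    """
    acts = {"identity": lambda v: v, "relu": lambda v: max(v, 0.0)}
    fn = acts[act]  # raises KeyError for activations this sketch does not cover
    return [
        [fn(scale * v + bias[j]) for j, v in enumerate(row)]
        for row in matmul(x, y)
    ]


def dual_gemm_swiglu_ref(x, w0, w1):
    """Hypothetical reference for gemm + gemm + swiglu.

    Assumed form: out = silu(x @ w0) * (x @ w1), elementwise product.
    """
    gate = matmul(x, w0)
    up = matmul(x, w1)
    return [
        [silu(g) * u for g, u in zip(gate_row, up_row)]
        for gate_row, up_row in zip(gate, up)
    ]
```

A reference like this could be used in unit tests to compare against the fused kernels within an FP8-appropriate tolerance, once the real epilogue order and output dtypes are confirmed.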