Wangzheee

Results 5 issues of Wangzheee

https://github.com/mnicely/cublasLt_examples/blob/bb82ffefbcb8d4f54e7aa101ba7a25f93dafabc1/cublasLt_INT8_TCs.cu#L85 The compute type on this line should be CUBLAS_COMPUTE_32I.
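For context, `CUBLAS_COMPUTE_32I` is the cuBLAS compute type for 32-bit integer accumulation, which is what an INT8 GEMM needs. The sketch below is a plain CPU reference in C++, not cuBLASLt code, and only illustrates the arithmetic that compute type implies. The function name `gemm_int8_ref` and the row-major layout are assumptions made for illustration and do not come from the linked example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference (not part of cuBLASLt):
// C (m x n, int32) = A (m x k, int8) * B (k x n, int8), all row-major.
// Products of int8 values are accumulated in int32, mirroring the
// integer accumulation that CUBLAS_COMPUTE_32I selects on the GPU.
std::vector<int32_t> gemm_int8_ref(const std::vector<int8_t>& a,
                                   const std::vector<int8_t>& b,
                                   int m, int n, int k) {
    // Guard against mismatched buffer sizes.
    assert(a.size() == static_cast<std::size_t>(m) * k);
    assert(b.size() == static_cast<std::size_t>(k) * n);

    std::vector<int32_t> c(static_cast<std::size_t>(m) * n, 0);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            int32_t acc = 0;  // 32-bit accumulator: a narrow one would overflow quickly
            for (int p = 0; p < k; ++p) {
                acc += static_cast<int32_t>(a[i * k + p]) *
                       static_cast<int32_t>(b[p * n + j]);
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

As a usage example, a 1x64 row of 127s multiplied by a 64x1 column of 127s gives 127 * 127 * 64 = 1032256, well outside the range of an 8-bit or 16-bit accumulator, which is why the accumulation type matters here.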

Add model list

### PR Category ### PR Types ### Description

### PR Category
Performance Optimization

### PR Types
New features

### Description
pcard-71500
1. Add cutlass kernel for **fp8_fp8_half_gemm_fused** (fuse gemm+bias+scale+act)
2. Add api, phi_kernel of **fp8_fp8_fp8_dual_gemm_fused** (fuse gemm+gemm+swiglu)
3. Add cutlass...