Paddle
[DCU] New features for LLM
PR Category Performance Optimization
PR Types New features
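Description
- Support flash attention (MHA and GQA, forward and backward; unit tests pass)
- Support a8w8-related operators (unit tests pass)
- Support quant_linear-related operators (unit tests pass)
- Support fused rope-related operators (unit tests pass)
- Support the multiclass_nms3 op (unit tests pass)
- Support batch norm calling MIOpen (enabled with FLAGS_batch_norm_use_miopen=1; v1 and v2 unit tests pass)
- Support the fp16 compute type for gemm (enabled with FLAGS_gemm_use_half_precision_compute_type=1)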
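
For reviewers who want to try the two opt-in paths, here is a minimal sketch of turning the flags on from Python. The flag names come from the description above. The helper `enable_dcu_flags` is hypothetical and not part of this PR. The sketch also assumes Paddle picks up `FLAGS_*` environment variables when it is imported, so the flags are set before any `import paddle`.

```python
import os

# Flag names are taken from the PR description; "1" enables each feature.
DCU_FLAGS = {
    "FLAGS_batch_norm_use_miopen": "1",
    "FLAGS_gemm_use_half_precision_compute_type": "1",
}


def enable_dcu_flags(env=None):
    """Hypothetical helper: write the two flags into a mapping and return it.

    When no mapping is given, the current process environment is updated.
    """
    target = os.environ if env is None else env
    target.update(DCU_FLAGS)
    return target


# Set the flags for this process. Import paddle only after this point
# (assumption: the flags are read from the environment at start-up).
enable_dcu_flags()
```

Exporting the same two variables in the shell before launching the job should be equivalent under the same assumption.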
Your PR has been submitted. Thanks for your contribution to this open-source project! Please wait for the automated CI test results first. See the Paddle CI Manual for details.