PaddleNLP

add fast_rmsnorm

Open deepllz opened this issue 1 year ago • 2 comments

PR types

Performance optimization

PR changes

Others

Description

Based on fast_ln, this PR adds support for fast_rms_norm. It makes the rms_norm operator about 2x faster. Model throughput is as follows:

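| Model | Parallel strategy | Throughput before PR | Throughput after PR |
| --- | --- | --- | --- |
| Llama-2 7B | gbs8, sharding8-mbs1-acc1 | 4454.693 | 4490.384 |
| Llama-2 13B | gbs8, pp4sharding2-vpp5-mbs1-acc4 | 2229.921 | 2252.541 |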
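To put the table in perspective, the end-to-end gain can be computed directly from the reported numbers. The snippet below is only a small sketch; the helper name `relative_gain_percent` is made up for illustration, and the throughput values are copied from the table above without assuming any particular unit.

```python
# (model, throughput before PR, throughput after PR), copied from the table above.
results = [
    ("Llama-2 7B", 4454.693, 4490.384),
    ("Llama-2 13B", 2229.921, 2252.541),
]


def relative_gain_percent(before, after):
    """Relative throughput improvement in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


gains = {model: relative_gain_percent(before, after) for model, before, after in results}

for model, gain in gains.items():
    print(f"{model}: +{gain:.2f}%")
```

This gives roughly +0.8% for the 7B run and +1.0% for the 13B run, so the model-level gain is much smaller than the operator-level speedup.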

With the use_fast_layer_norm switch, the results can be aligned bit for bit.
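For readers who want to see what "bit for bit" means here, the following is a rough pure-Python sketch, not the PR's kernel. It assumes the textbook RMSNorm definition, y_i = x_i / sqrt(mean(x^2) + eps) * w_i, with an assumed eps of 1e-6. The names `rms_norm_reference`, `float_bits`, and `bitwise_equal` are hypothetical. Python floats are float64, whereas real training typically runs in fp16, bf16, or fp32, so this only illustrates the comparison idea.

```python
import math
import struct


def rms_norm_reference(x, weight, eps=1e-6):
    """Textbook RMSNorm over one hidden vector (assumed definition and eps)."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def float_bits(v):
    """Return the IEEE-754 float64 bit pattern of v as an integer."""
    return struct.unpack("<Q", struct.pack("<d", v))[0]


def bitwise_equal(a, b):
    """True only if both sequences have identical bit patterns element-wise."""
    return len(a) == len(b) and all(
        float_bits(p) == float_bits(q) for p, q in zip(a, b)
    )


x = [0.5, -1.25, 2.0, 3.5]
w = [1.0, 1.0, 0.5, 2.0]

baseline = rms_norm_reference(x, w)
candidate = rms_norm_reference(x, w)  # stand-in for the fused kernel's output

print(bitwise_equal(baseline, candidate))
# A difference of one unit in the last place already breaks bitwise equality.
print(bitwise_equal([1.0], [1.0 + 2.0 ** -52]))
```

A bitwise check like this is stricter than a tolerance-based comparison: any change in the order of floating-point operations inside a fused kernel can cause it to fail even when the results are numerically very close.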

deepllz, Jun 28 '24 02:06

Thanks for your contribution!

paddle-bot[bot], Jun 28 '24 02:06

Codecov Report

Attention: Patch coverage is 22.22222% with 7 lines in your changes missing coverage. Please review.

Project coverage is 55.74%. Comparing base (c574d6d) to head (0a7af50). Report is 222 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| paddlenlp/transformers/llama/fusion_ops.py | 25.00% | 6 Missing :warning: |
| paddlenlp/transformers/llama/modeling.py | 0.00% | 1 Missing :warning: |
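The overall patch coverage figure can be cross-checked against the per-file rows above. This is a small sketch under the assumption that "Patch %" is the share of covered lines among the changed lines in each file; the helper `total_changed_lines` is hypothetical and not part of Codecov.

```python
# (file, patch coverage in percent, missing lines), copied from the rows above.
rows = [
    ("paddlenlp/transformers/llama/fusion_ops.py", 25.00, 6),
    ("paddlenlp/transformers/llama/modeling.py", 0.00, 1),
]


def total_changed_lines(patch_pct, missing):
    """Infer changed-line count from the coverage percent and missing count."""
    return round(missing / (1.0 - patch_pct / 100.0))


changed = sum(total_changed_lines(pct, missing) for _, pct, missing in rows)
missing_total = sum(missing for _, _, missing in rows)
covered = changed - missing_total
patch_coverage = covered / changed * 100.0

print(changed, missing_total, covered)
print(f"{patch_coverage:.5f}%")
```

Under that assumption the rows imply 9 changed lines with 2 covered, which matches the reported 22.22222% and the 7 missing lines.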
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8680   +/-   ##
========================================
  Coverage    55.74%   55.74%           
========================================
  Files          623      623           
  Lines        97454    97457    +3     
========================================
+ Hits         54323    54331    +8     
+ Misses       43131    43126    -5     


codecov[bot], Jun 28 '24 03:06

Please show the accuracy test results in the PR.

ZHUI, Jul 01 '24 03:07