[Auto Parallel] fix bugs for split_batches_for_accumulation && fix bugs for enable_delay_scale_loss
PR types
Bug fixes
PR changes
Others
Description
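A. Fix a case under dynamic-graph auto parallel where split_batches_for_accumulation does not align with dynamic-graph manual parallel, as shown in the figure.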
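For readers unfamiliar with the term, splitting a batch for accumulation means cutting one mini batch into as many micro batches as there are accumulation steps. The sketch below illustrates the idea on plain Python lists. The helper name `split_batch_for_accumulation`, the dict-of-lists batch layout, and the contiguous, equal-sized split are all assumptions made for illustration; this is not the PaddleNLP implementation, and it does not reproduce the specific misalignment this PR fixes.

```python
def split_batch_for_accumulation(batch, acc_steps):
    """Split a mini batch (dict of equal-length lists) into acc_steps micro batches.

    Hypothetical helper: the contiguous, equal-sized split is an assumption.
    """
    sizes = {len(values) for values in batch.values()}
    if len(sizes) != 1:
        raise ValueError("all fields must share the same batch size")
    batch_size = sizes.pop()
    if acc_steps <= 0 or batch_size % acc_steps != 0:
        raise ValueError("batch size must be a positive multiple of acc_steps")
    micro_size = batch_size // acc_steps
    # Micro batch i takes samples [i * micro_size, (i + 1) * micro_size) of every field.
    return [
        {key: values[i * micro_size:(i + 1) * micro_size] for key, values in batch.items()}
        for i in range(acc_steps)
    ]


if __name__ == "__main__":
    mini_batch = {"input_ids": [[1, 2], [3, 4], [5, 6], [7, 8]], "labels": [0, 1, 0, 1]}
    for micro_batch in split_batch_for_accumulation(mini_batch, acc_steps=2):
        print(micro_batch)
```

Two code paths that are meant to match need to assign the same samples to the same micro batch, so a check like comparing the per-step micro batches of both paths is one way to detect a mismatch.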
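B. Fix a logic error in enable_delay_scale_loss under dynamic-graph auto parallel. Auto parallel implements enable_delay_scale_loss by default, and the expected behavior is:
- Compute the loss for each micro batch
- Run backward propagation
- Sum the losses within the mini batch
- Scale the loss by dividing it by the number of accumulation steps
However, the current behavior of dynamic-graph auto parallel is:
- Compute the loss for each micro batch
- Scale that loss by dividing it by the number of accumulation steps
- Run backward propagation
- Sum the losses within the mini batch
In addition, because the enable_delay_scale_loss logic is incomplete, a switch is added for auto parallel.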
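The two orderings above can be sketched in plain Python on a one-parameter model. This is a minimal illustration, not the trainer code: the function names and the toy squared-error model are hypothetical, and dividing the accumulated gradient once after the loop in the delayed variant is an assumption about what delaying the scale implies for gradients (the description above only lists the loss).

```python
def micro_loss_and_grad(w, x, y):
    """Squared error of the toy model w * x and its gradient with respect to w."""
    err = w * x - y
    return err * err, 2.0 * err * x


def step_delayed_scale(w, micro_batches):
    """Expected order: loss -> backward -> sum over the mini batch -> scale once."""
    acc_steps = len(micro_batches)
    loss_sum, grad_sum = 0.0, 0.0
    for x, y in micro_batches:
        loss, grad = micro_loss_and_grad(w, x, y)  # backward on the unscaled loss
        loss_sum += loss
        grad_sum += grad
    # Assumption: the accumulated gradient is rescaled here, together with the loss.
    return loss_sum / acc_steps, grad_sum / acc_steps


def step_eager_scale(w, micro_batches):
    """Previous order: loss -> scale -> backward -> sum over the mini batch."""
    acc_steps = len(micro_batches)
    loss_sum, grad_sum = 0.0, 0.0
    for x, y in micro_batches:
        loss, grad = micro_loss_and_grad(w, x, y)
        loss_sum += loss / acc_steps
        # Backward through an already scaled loss yields an equally scaled gradient.
        grad_sum += grad / acc_steps
    return loss_sum, grad_sum


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (4.0, 0.0)]
    print(step_delayed_scale(0.5, data))
    print(step_eager_scale(0.5, data))
```

In exact arithmetic both functions return the same mean loss and mean gradient. What differs is where the division happens, which matters when the two code paths must agree step by step or when intermediate values are kept in reduced precision.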
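C. Fix how the loss is printed under static-graph auto parallel. The expected behavior for the displayed loss is:
- Sum the losses of all micro batches
- Scale the sum by dividing it by the number of accumulation steps
However, the current behavior of static-graph auto parallel when printing the loss is:
- Scale the loss of each micro batch by dividing it by the number of accumulation steps
- Sum the scaled losses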
D. Fix a typo in a function name: traning is corrected to training.
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is 2.77778% with 35 lines in your changes missing coverage. Please review.
Project coverage is 52.74%. Comparing base (76a118b) to head (ad0488c). Report is 264 commits behind head on develop.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| paddlenlp/trainer/auto_trainer.py | 0.00% | 34 Missing :warning: |
| paddlenlp/trainer/trainer.py | 50.00% | 1 Missing :warning: |
Additional details and impacted files
| Coverage Diff | develop | #9217 | +/- |
|---|---|---|---|
| Coverage | 53.11% | 52.74% | -0.38% |
| Files | 665 | 661 | -4 |
| Lines | 109041 | 107375 | -1666 |
| Hits | 57918 | 56634 | -1284 |
| Misses | 51123 | 50741 | -382 |