PaddleNLP

[llm]support long sequence training

Open lugimzzz opened this issue 1 year ago • 3 comments

PR types

New features

PR changes

Others

Description

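Adds support for llama 128k training on a single machine with 8 GPUs.

Open issues:

  1. With fuse_fused_head_and_loss_fn, the loss is abnormal when pp and eval are both enabled; this needs investigation.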
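The thread does not show what the fused head-and-loss path does internally. Assuming it combines the LM head projection with the cross-entropy so that logits for the whole sequence are never held at once (a common approach for long sequences, not confirmed here), one way to investigate an abnormal loss is to compare a chunked computation against an unfused reference on the same inputs. The sketch below is plain Python; `full_loss`, `chunked_loss`, and the toy data are hypothetical and are not PaddleNLP APIs.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_entropy(logits, label):
    # Numerically stable log-sum-exp minus the logit of the target class.
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return lse - logits[label]


def full_loss(hidden, weight, labels):
    # Reference path: build the [seq_len x vocab] logits for every token first.
    logits = [[dot(h, w) for w in weight] for h in hidden]
    losses = [cross_entropy(row, y) for row, y in zip(logits, labels)]
    return sum(losses) / len(losses)


def chunked_loss(hidden, weight, labels, chunk_size):
    # Assumed "fused" path: only [chunk_size x vocab] logits exist at a time.
    total = 0.0
    for start in range(0, len(hidden), chunk_size):
        rows = hidden[start:start + chunk_size]
        ys = labels[start:start + chunk_size]
        logits = [[dot(h, w) for w in weight] for h in rows]
        total += sum(cross_entropy(row, y) for row, y in zip(logits, ys))
    return total / len(hidden)


# Toy data (hypothetical): 5 tokens, hidden size 2, vocabulary of 3.
hidden = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6], [0.7, 0.8], [0.0, -0.9]]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
labels = [0, 2, 1, 1, 0]

reference = full_loss(hidden, weight, labels)
fused = chunked_loss(hidden, weight, labels, chunk_size=2)
print(math.isclose(reference, fused, rel_tol=1e-9, abs_tol=1e-12))
```

If the two values diverge only under a particular configuration (for example with pp and eval both on), the difference points at how that configuration splits or averages the per-token losses rather than at the loss formula itself.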

lugimzzz avatar Sep 26 '24 11:09 lugimzzz

Thanks for your contribution!

paddle-bot[bot] avatar Sep 26 '24 11:09 paddle-bot[bot]

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Sep 26 '24 11:09 CLAassistant

Codecov Report

Attention: Patch coverage is 13.55932% with 51 lines in your changes missing coverage. Please review.

Project coverage is 53.01%. Comparing base (ad14dc4) to head (1f14291). Report is 476 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| paddlenlp/transformers/llama/modeling.py | 10.00% | 27 Missing :warning: |
| paddlenlp/transformers/tensor_parallel_utils.py | 10.00% | 18 Missing :warning: |
| paddlenlp/transformers/llama/fusion_ops.py | 0.00% | 4 Missing :warning: |
| paddlenlp/trl/dpo_criterion.py | 0.00% | 2 Missing :warning: |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9208      +/-   ##
===========================================
- Coverage    53.06%   53.01%   -0.05%     
===========================================
  Files          656      656              
  Lines       106147   106159      +12     
===========================================
- Hits         56324    56281      -43     
- Misses       49823    49878      +55     


codecov[bot] avatar Sep 26 '24 12:09 codecov[bot]

This Pull Request is stale because it has been open for 60 days with no activity.

github-actions[bot] avatar Dec 16 '24 00:12 github-actions[bot]