PaddleNLP

[DATA] Update data preprocessing

Open KB-Ding opened this issue 2 years ago • 5 comments

PR types

New features

PR changes

APIs

Description

Support the Megatron dataset format.

KB-Ding, Jun 20 '23 11:06

Thanks for your contribution!

paddle-bot[bot], Jun 20 '23 11:06

Codecov Report

Attention: Patch coverage is 0.64516%, with 462 lines in your changes missing coverage. Please review.

Project coverage is 62.78%. Comparing base (7c6772b) to head (ed8fbe0). Report is 824 commits behind head on develop.

| Files | Patch % | Lines |
|---|---|---|
| paddlenlp/data/indexed_dataset.py | 0.00% | 460 Missing :warning: |
| paddlenlp/trainer/trainer.py | 60.00% | 2 Missing :warning: |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6223      +/-   ##
===========================================
- Coverage    63.16%   62.78%   -0.39%     
===========================================
  Files          529      530       +1     
  Lines        77184    77645     +461     
===========================================
- Hits         48755    48746       -9     
- Misses       28429    28899     +470     
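
The headline patch coverage figure can be cross-checked against the per-file rows of the report. The following is a minimal sketch, assuming that patch coverage is the number of covered changed lines divided by the total number of changed lines; the helper `changed_lines` and all variable names are hypothetical and are not part of Codecov.

```python
# Per-file figures copied from the Codecov report: (patch %, missing lines).
files = {
    "paddlenlp/data/indexed_dataset.py": (0.00, 460),
    "paddlenlp/trainer/trainer.py": (60.00, 2),
}


def changed_lines(patch_pct: float, missing: int) -> int:
    """Recover the number of changed lines from a file's patch % and missing count.

    Assumes missing = total * (1 - patch_pct / 100), which is an assumption
    about how the report is computed, not documented Codecov behaviour.
    """
    return round(missing / (1 - patch_pct / 100))


total_lines = sum(changed_lines(pct, miss) for pct, miss in files.values())
missing_lines = sum(miss for _, miss in files.values())
covered_lines = total_lines - missing_lines
patch_pct = 100 * covered_lines / total_lines

print(f"{covered_lines} of {total_lines} changed lines covered: {patch_pct:.5f}%")
```

Under that assumption, the two rows account for 465 changed lines with 462 missing, which reproduces the 0.64516% quoted at the top of the report.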


codecov[bot], Jun 20 '23 11:06

This Pull Request is stale because it has been open for 60 days with no activity.

github-actions[bot], Oct 28 '23 00:10

This Pull Request is stale because it has been open for 60 days with no activity.

github-actions[bot], Apr 28 '24 00:04