InternEvo
feat(dataloader): refine implementation of mocked and megatron dataloader
Motivation
- Fix CI timeout for https://github.com/InternLM/InternEvo/issues/342 (Completed)
- Refine implementation of megatron and mocked dataloader (Completed)
Modification
- internlm/train/pipeline.py
- internlm/data/*
BC-breaking (Optional)
None
Use cases (Optional)
None
Checklist
Before PR:
- [ ] Pre-commit or other linting tools are used to fix the potential lint issues.
- [ ] Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
- [ ] The documentation has been modified accordingly, like docstring or example tutorials.
After PR:
- [ ] If the modification may affect downstream or other related projects, this PR should be tested with those projects.
- [ ] CLA has been signed and all committers have signed the CLA in this PR.