ColossalAI
[autoparallel] find repeat blocks
📌 Checklist before creating the PR
- [x] I have created an issue for this PR for traceability
- [x] The title follows the standard format:
[doc/gemini/tensor/...]: A concise description
- [x] I have added relevant tags if possible for us to better distinguish different PRs
🚨 Issue number
Link this PR to your issue with words like fixed to automatically close the linked issue upon merge
e.g. `fixed #1234`, `closed #1234`, `resolved #1234`
📝 What does this PR do?
The solving time of the auto-parallel intra-op solver becomes unacceptable for LLMs as the number of layers increases.
We can take the following steps to reduce the solving time for LLMs:
- Find the largest repeated blocks.
- Use an alias set to force all repeated blocks to share the same distributed training strategy.
This PR implements a method to find the largest repeated blocks, which covers the first step.
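
To illustrate the first step, here is a minimal sketch and not the code added in this PR. It assumes each graph node has already been reduced to a hashable signature (for example, an op name plus tensor shapes); the function name `find_repeated_blocks` and the signature strings are hypothetical. It searches for the longest contiguous run of signatures that occurs at least twice without overlapping.

```python
from typing import Hashable, List, Sequence


def find_repeated_blocks(signatures: Sequence[Hashable]) -> List[List[int]]:
    """Return the node indices of each occurrence of the longest repeated block.

    A block is a contiguous run of signatures that occurs at least twice
    without overlapping. An empty list means nothing repeats.
    """
    n = len(signatures)
    # Try the longest candidate length first, so the first hit is the largest block.
    for length in range(n // 2, 0, -1):
        for start in range(0, n - 2 * length + 1):
            block = list(signatures[start:start + length])
            starts = [start]
            pos = start + length
            # Collect non-overlapping occurrences to the right of the first one.
            while pos + length <= n:
                if list(signatures[pos:pos + length]) == block:
                    starts.append(pos)
                    pos += length
                else:
                    pos += 1
            if len(starts) > 1:
                return [list(range(s, s + length)) for s in starts]
    return []


# Hypothetical signatures for a model with three identical layers.
layer = ["attn", "mlp", "norm"]
sigs = ["embed"] + layer * 3 + ["head"]
print(find_repeated_blocks(sigs))
```

The brute-force search is cubic in the number of nodes, which is fine for a sketch but likely too slow for large graphs. Because it prefers the longest match, a model with an even number of identical layers would yield blocks that span several layers each. The positions returned for each occurrence line up one to one, so they could serve as input for the alias set described in the second step.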
💥 Checklist before requesting a review
- [x] I have linked my PR to an issue (instruction)
- [x] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
- [x] I have performed a self-review of my code
- [x] I have added thorough tests.
- [x] I have added docstrings for all the functions/methods I implemented
⭐️ Do you enjoy contributing to Colossal-AI?
- [x] 🌝 Yes, I do.
- [ ] 🌚 No, I don't.
Tell us more if you don't enjoy contributing to Colossal-AI.
The code coverage for the changed files is 23%.
| Name | Stmts | Miss | Cover |
|------|-------|------|-------|
| colossalai/auto_parallel/tensor_shard/utils/factory.py | 125 | 109 | 13% |
| tests/test_auto_parallel/test_pass/__init__.py | 0 | 0 | 100% |
| tests/test_auto_parallel/test_tensor_shard/test_find_repeat_block.py | 73 | 44 | 40% |
| TOTAL | 198 | 153 | 23% |
The code coverage for the changed files is 23%.
| Name | Stmts | Miss | Cover |
|------|-------|------|-------|
| colossalai/auto_parallel/tensor_shard/utils/factory.py | 123 | 107 | 13% |
| tests/test_auto_parallel/test_pass/__init__.py | 0 | 0 | 100% |
| tests/test_auto_parallel/test_tensor_shard/test_find_repeat_block.py | 75 | 46 | 39% |
| TOTAL | 198 | 153 | 23% |