ColossalAI
[DTensor] refactor CommSpec
📌 Checklist before creating the PR
- [x] I have created an issue for this PR for traceability
- [x] The title follows the standard format: [doc/gemini/tensor/...]: A concise description
- [x] I have added relevant tags if possible for us to better distinguish different PRs
🚨 Issue number
resolved #3035
📝 What does this PR do?
The previous CommSpec used in the Auto-Parallel module contains attributes that are not necessary to describe a communication operation, such as ShardingSpec and DeviceMesh.
Previously, it had two main functions:
- Compute the communication cost, which is used by the auto-parallel solver.
- Convert the communication spec into a real action, which is used at runtime.
However, the first function may not be necessary outside the auto-parallel scenario.
To keep the design clean, the new CommSpec contains only the attributes needed to describe a communication operation and supplies a function to apply the CommSpec to real execution.
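
To illustrate the shape of this design, here is a minimal sketch. It is not the code in this PR: `CollectivePattern`, `CommSpecSketch`, `apply_comm_spec`, and the `ranks` field are hypothetical names, and the collectives are simulated in a single process on plain Python lists rather than executed through a real distributed backend. The point is only that the spec carries a pattern plus the participating ranks, with no ShardingSpec, DeviceMesh, or cost computation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class CollectivePattern(Enum):
    """Hypothetical communication patterns, for illustration only."""
    ALL_GATHER = "all_gather"
    ALL_REDUCE = "all_reduce"
    SPLIT = "split"


@dataclass(frozen=True)
class CommSpecSketch:
    """Holds only what is needed to describe one communication operation."""
    pattern: CollectivePattern
    ranks: Tuple[int, ...]  # assumed stand-in for a process group


def apply_comm_spec(spec: CommSpecSketch, local: List[List[float]]) -> List[List[float]]:
    """Simulate the operation in one process.

    local[r] is the 1-D tensor held by rank r. Ranks outside spec.ranks
    are returned unchanged.
    """
    out = [list(t) for t in local]
    group = list(spec.ranks)
    if spec.pattern is CollectivePattern.ALL_GATHER:
        # Every participating rank ends up with the concatenation of all shards.
        gathered = [x for r in group for x in local[r]]
        for r in group:
            out[r] = list(gathered)
    elif spec.pattern is CollectivePattern.ALL_REDUCE:
        # Every participating rank ends up with the element-wise sum.
        summed = [sum(vals) for vals in zip(*(local[r] for r in group))]
        for r in group:
            out[r] = list(summed)
    elif spec.pattern is CollectivePattern.SPLIT:
        # The i-th participating rank keeps only the i-th equal-sized chunk.
        for i, r in enumerate(group):
            chunk = len(local[r]) // len(group)
            out[r] = list(local[r][i * chunk:(i + 1) * chunk])
    return out


if __name__ == "__main__":
    spec = CommSpecSketch(CollectivePattern.ALL_GATHER, ranks=(0, 1))
    print(apply_comm_spec(spec, [[1.0], [2.0]]))
```

Because the sketch keeps no cost model, a solver that needs communication cost would compute it separately from the spec.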

💥 Checklist before requesting a review
- [x] I have linked my PR to an issue (instruction)
- [x] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
- [x] I have performed a self-review of my code
- [x] I have added thorough tests.
- [x] I have added docstrings for all the functions/methods I implemented
⭐️ Do you enjoy contributing to Colossal-AI?
- [x] 🌝 Yes, I do.
- [ ] 🌚 No, I don't.
Tell us more if you don't enjoy contributing to Colossal-AI.