ColossalAI
Making large AI models cheaper, faster and more accessible
Hello, why is GPU memory usage with TP under the Hybrid Parallel Plugin higher than with DeepSpeed under the same configuration?
### Describe the feature Hello, are there any existing implementations of expert-parallel code for the new MoE models, such as Qwen and DeepSeek?
### 📚 The doc issue Hi @FrankLeeeee, @ver217, please update the https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/shardformer/README.md file. The line `# tensor_parallel_mode: Literal['1d', '2d', '2.5d', '3d']` is commented out in ShardConfig but documentation is not...
Is ColossalAI free, and can I use the code for free?
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
updates:
- [github.com/pycqa/isort: 5.13.2 → 7.0.0](https://github.com/pycqa/isort/compare/5.13.2...7.0.0)
- [github.com/psf/black-pre-commit-mirror: 24.10.0 → 25.12.0](https://github.com/psf/black-pre-commit-mirror/compare/24.10.0...25.12.0)
- [github.com/pre-commit/mirrors-clang-format: v19.1.5 → v21.1.7](https://github.com/pre-commit/mirrors-clang-format/compare/v19.1.5...v21.1.7)
- [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0)
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug [rank0]: Traceback (most recent call last): [rank0]: File...
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug When fine-tuning a base model (Qwen2.5 3B Base)...