ColossalAI

Making large AI models cheaper, faster and more accessible

Results 1091 ColossalAI issues

Hello, why is GPU memory usage with TP under the Hybrid Parallel Plugin higher than with DeepSpeed under the same configuration?

### Describe the feature Hello, are there any existing implementations of expert parallel code for the new MoE model, like qwen and deepseek?

enhancement

### 📚 The doc issue Hi @FrankLeeeee , @ver217 please update the https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/shardformer/README.md file # tensor_parallel_mode: Literal['1d', '2d', '2.5d', '3d'] this line is commented in ShardConfig but documentation is not...

documentation

Is ColossalAI free, and can I use the code at no cost?

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...

updates:
- [github.com/pycqa/isort: 5.13.2 → 7.0.0](https://github.com/pycqa/isort/compare/5.13.2...7.0.0)
- [github.com/psf/black-pre-commit-mirror: 24.10.0 → 25.12.0](https://github.com/psf/black-pre-commit-mirror/compare/24.10.0...25.12.0)
- [github.com/pre-commit/mirrors-clang-format: v19.1.5 → v21.1.7](https://github.com/pre-commit/mirrors-clang-format/compare/v19.1.5...v21.1.7)
- [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0)

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...

## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...

### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug [rank0]: Traceback (most recent call last): [rank0]: File...

bug

### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug When fine-tuning a base model (Qwen2.5 3B Base)...

bug