ColossalAI
Making large AI models cheaper, faster and more accessible
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug When training a language model with the GeminiPlugin,...
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug My code is based on Open-Sora, and can...
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
updates:
- [github.com/psf/black-pre-commit-mirror: 24.8.0 → 24.10.0](https://github.com/psf/black-pre-commit-mirror/compare/24.8.0...24.10.0)
- [github.com/pre-commit/mirrors-clang-format: v18.1.8 → v19.1.1](https://github.com/pre-commit/mirrors-clang-format/compare/v18.1.8...v19.1.1)
- [github.com/pre-commit/pre-commit-hooks: v4.6.0 → v5.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.6.0...v5.0.0)
### Describe the feature https://github.com/linkedin/Liger-Kernel Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory...
Hello, I want to implement the FasterMoE shadow expert based on ColossalAI-MoeHybridParallel. Is it possible? How can I achieve it?
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug I want to use nvidia H20 machine to...
## 📝 What does this PR do? Adds a comparison of the principles behind the sequence parallelism methods, including ring-attention and Ulysses, and an explanation of their use cases.