
Making large AI models cheaper, faster and more accessible

1091 ColossalAI issues, sorted by recently updated

### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug When training a language model with the GeminiPlugin,...

bug
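For context on the report above, this is roughly what a GeminiPlugin training setup looks like with the public Booster API. It is a minimal sketch, not the reporter's code: the toy model, loss, hyperparameters, and plugin arguments are illustrative assumptions, and it assumes a torchrun launch on CUDA devices.

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin
from colossalai.nn.optimizer import HybridAdam

colossalai.launch_from_torch()  # reads rank/world size from the torchrun environment

model = torch.nn.Linear(1024, 1024).cuda()   # placeholder for a real language model
criterion = torch.nn.MSELoss()               # placeholder loss
optimizer = HybridAdam(model.parameters(), lr=1e-4)

plugin = GeminiPlugin(placement_policy="auto", precision="bf16")  # illustrative settings
booster = Booster(plugin=plugin)
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion)

inputs = torch.randn(8, 1024, device="cuda")
loss = criterion(model(inputs), torch.randn(8, 1024, device="cuda"))
booster.backward(loss, optimizer)  # Gemini requires booster.backward instead of loss.backward
optimizer.step()
optimizer.zero_grad()
```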

### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug My code is based on Open-Sora, and can...

bug

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...

updates:
- [github.com/psf/black-pre-commit-mirror: 24.8.0 → 24.10.0](https://github.com/psf/black-pre-commit-mirror/compare/24.8.0...24.10.0)
- [github.com/pre-commit/mirrors-clang-format: v18.1.8 → v19.1.1](https://github.com/pre-commit/mirrors-clang-format/compare/v18.1.8...v19.1.1)
- [github.com/pre-commit/pre-commit-hooks: v4.6.0 → v5.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.6.0...v5.0.0)

### Describe the feature https://github.com/linkedin/Liger-Kernel Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory...

enhancement
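As a rough illustration of what the requested integration would cover at the user level, this is how Liger Kernel is typically enabled for a Hugging Face Llama model today. The patching helper name comes from the Liger-Kernel README and is not ColossalAI API; the checkpoint name is a placeholder.

```python
from liger_kernel.transformers import apply_liger_kernel_to_llama
from transformers import AutoModelForCausalLM

# Monkey-patch Llama's RMSNorm, RoPE, SwiGLU and cross-entropy with Triton kernels
# before the model is instantiated.
apply_liger_kernel_to_llama()
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")  # placeholder checkpoint
```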

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...

Hello, I want to implement the FasterMoE shadow expert on top of ColossalAI's MoeHybridParallel. Is that possible, and how can I achieve it?
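For reference, FasterMoE's shadowing idea is to replicate a "hot" expert to every rank whenever the all-to-all traffic its tokens would cause outweighs the cost of broadcasting its weights. The single-process sketch below only illustrates that decision rule; the names and cost model are assumptions, not ColossalAI or FasterMoE code.

```python
from dataclasses import dataclass

@dataclass
class ShadowConfig:
    hidden_size: int = 4096          # activation width per token (illustrative)
    expert_params: int = 50_000_000  # parameter count of one expert (illustrative)
    bytes_per_elem: int = 2          # bf16

def experts_to_shadow(tokens_per_expert: list[int], local_share: float,
                      cfg: ShadowConfig) -> list[int]:
    """Return indices of experts worth replicating to every rank this step."""
    shadowed = []
    for eid, n_tokens in enumerate(tokens_per_expert):
        # Bytes moved if remote tokens are shipped to the expert's home rank and back.
        token_bytes = 2 * n_tokens * (1 - local_share) * cfg.hidden_size * cfg.bytes_per_elem
        # Bytes moved if the expert's weights are broadcast so tokens stay local.
        weight_bytes = cfg.expert_params * cfg.bytes_per_elem
        if token_bytes > weight_bytes:
            shadowed.append(eid)
    return shadowed

# Expert 0 is a hot expert that most tokens route to, so only it gets shadowed.
print(experts_to_shadow([200_000, 3_000, 2_500, 1_800], local_share=0.25, cfg=ShadowConfig()))
```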

### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug I want to use an NVIDIA H20 machine to...

bug

## 📝 What does this PR do? A supplementary comparison of the principles of sequence parallelism, covering ring attention and Ulysses, with an explanation of their respective use cases.
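For readers comparing the two approaches, here is a minimal sketch of the sequence-to-head all-to-all that Ulysses relies on: each rank starts with a sequence shard holding all heads and ends with the full sequence for a subset of heads. This is not ColossalAI's implementation; it assumes the head count is divisible by the world size and a torchrun launch.

```python
import torch
import torch.distributed as dist

def ulysses_all_to_all(x: torch.Tensor, group=None) -> torch.Tensor:
    """[seq_local, num_heads, head_dim] per rank -> [seq_full, num_heads // world, head_dim]."""
    world = dist.get_world_size(group)
    # One head-chunk per destination rank (assumes num_heads % world == 0).
    send = [c.contiguous() for c in x.chunk(world, dim=1)]
    recv = [torch.empty_like(send[0]) for _ in range(world)]
    dist.all_to_all(recv, send, group=group)
    # Each received chunk is another rank's sequence shard for our head subset.
    return torch.cat(recv, dim=0)

if __name__ == "__main__":  # launch with: torchrun --nproc_per_node=<N> this_file.py
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    x = torch.randn(128, 16, 64, device="cuda")  # local sequence shard, all 16 heads
    y = ulysses_all_to_all(x)                    # full sequence, 16 // world_size heads
    print(dist.get_rank(), y.shape)
```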