ColossalAI

Making large AI models cheaper, faster and more accessible

Results 1072 ColossalAI issues

### 🐛 Describe the bug When I run the example in your tutorials (basic/colotensor), I run into some problems. Traceback (most recent call last): File "colossalai-study/run_dist.py", line 8, in <module> from colossalai.testing...

bug

### 🐛 Describe the bug Hi, how can I fine-tune the GLM-130B model with Colossal-AI? GLM-130B: https://keg.cs.tsinghua.edu.cn/glm-130b/zh/posts/glm-130b/ ### Environment _No response_

bug

### 🐛 Describe the bug I get `CUDA out of memory. Tried to allocate 25.10 GiB` when running `train_sft.sh`. It needs 25.1 GB, and my GPU is a V100 and memory...

bug

### 🐛 Describe the bug I ran the supervised instruction tuning command for Coati, following the instructions in README.md. It raised an error related to NCCL...

bug

### 🐛 Describe the bug Tried to run train_sft.sh and got an OOM error: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 23.68 GiB total capacity; 18.08 GiB...

bug

### Describe the feature Currently, FP16 support only makes it possible to train models smaller than 2B on a single graphics card with 24 GB of memory. However, the mainstream useful...

enhancement

## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]:...

### 📚 The doc issue We hope the maintainers can improve the documentation and provide more detailed deployment and training steps.

documentation