
Making large AI models cheaper, faster and more accessible

Results: 1,091 ColossalAI issues

### 🐛 Describe the bug Pretraining llama2-7b can resume when using the "zero2" plugin, but cannot resume when using the "gemini" plugin; with the "gemini" plugin, the resume process gets stuck,...

bug
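For context on what resuming a pretraining run involves, here is a minimal plain-PyTorch sketch of saving and restoring model, optimizer, and step state. It deliberately does not use the ColossalAI "zero2" or "gemini" plugins from the report (which additionally shard this state across devices); the model, file name, and step count are hypothetical.

```python
import torch
from torch import nn

# Minimal checkpoint save/resume sketch. ColossalAI plugins such as
# "zero2" and "gemini" shard this same state across GPUs, which is
# where plugin-specific resume logic comes in.
model = nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Save a checkpoint (the path "checkpoint.pt" is hypothetical).
ckpt = {
    "step": 100,
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}
torch.save(ckpt, "checkpoint.pt")

# Resume: rebuild the same objects, then load the saved state into them.
model2 = nn.Linear(8, 8)
optimizer2 = torch.optim.AdamW(model2.parameters(), lr=1e-3)
loaded = torch.load("checkpoint.pt")
model2.load_state_dict(loaded["model"])
optimizer2.load_state_dict(loaded["optimizer"])
start_step = loaded["step"]
print(start_step)  # training resumes from step 100
```

A plugin that resumes correctly must restore all three pieces (weights, optimizer state, and progress counter) consistently on every rank; a hang during this phase typically points to ranks disagreeing during the collective load.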

### 🐛 Describe the bug Question: When I trained ViT on the ImageNet-1k and CIFAR-10 datasets, I repeatedly adjusted the parameter configuration according to the official ViT configuration, but the...

bug

### 🐛 Describe the bug

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 from colossalai.booster import Booster

File ~/.local/lib/python3.11/site-packages/colossalai/booster/__init__.py:2
      1 from .accelerator import Accelerator
---->...
```

bug

### 📚 The doc issue May I ask what dataset was used to train Colossal-Llama-2?

documentation

### 🐛 Describe the bug

```
File "/data/llmodel/miniconda3/envs/colossal/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
File "/data/llmodel/miniconda3/envs/colossal/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
File "/data/llmodel/huap/ColossalAI/applications/Colossal-LLaMA-2/colossal_llama2/utils/flash_attention_patch.py", line 133, in attention_forward
    cos,...
```

bug

### Describe the feature We are excited to announce the addition of support for the qwen2 model in the ColossalAI framework. The qwen2 model is compatible with version 4.39.3 of...

enhancement

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A...

### 🐛 Describe the bug I noticed that `from_torch_tensor` method of class `ColoParameter` and `ColoTensor` have been removed in PR #4479 ([`colossalai/tensor/colo_parameter.py`](https://github.com/hpcaitech/ColossalAI/pull/4479/files#diff-0d13ce3fae72d4ebe67bce9ef2441e4495a6aeee40c5532c30a985e79bc57cb6L66), [`colossalai/tensor/colo_tensor.py`](https://github.com/hpcaitech/ColossalAI/pull/4479/files#diff-0eee6bc157c59a4fb490823d53da0647d9793793bc4669f3e41146d3d99c7dd3L265)). But this method was still called under...

bug

### 🐛 Describe the bug When using tensor parallelism, model parameters are sharded across GPUs to reduce memory consumption and enable parallel execution. However, the optimizer still holds unsharded model...

bug
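A back-of-envelope calculation shows why an unsharded optimizer dominates memory under tensor parallelism. The sketch below assumes fp16 parameters and Adam-style optimizer state (two fp32 buffers per parameter); the 7B parameter count and tensor-parallel degree of 4 are illustrative, not taken from the report.

```python
# Rough per-GPU memory estimate (assumptions: fp16 params at 2 bytes each,
# Adam momentum and variance buffers in fp32 at 4 bytes each).
def memory_gb(n_params: int, tp_degree: int, shard_optimizer: bool) -> float:
    param_bytes = 2 * n_params / tp_degree   # parameter shards are split
    opt_bytes = 2 * 4 * n_params             # two fp32 states per parameter
    if shard_optimizer:
        opt_bytes /= tp_degree               # states split like the params
    return (param_bytes + opt_bytes) / 1e9

n = 7_000_000_000  # illustrative 7B-parameter model
print(memory_gb(n, tp_degree=4, shard_optimizer=False))  # 59.5 GB per GPU
print(memory_gb(n, tp_degree=4, shard_optimizer=True))   # 17.5 GB per GPU
```

With parameters sharded 4 ways but optimizer state unsharded, the 56 GB of Adam state dwarfs the 3.5 GB parameter shard, which is exactly the inefficiency the issue describes.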

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...