ColossalAI

Making large AI models cheaper, faster and more accessible

Results: 1072 ColossalAI issues
Sort by recently updated

### 🐛 Describe the bug - Run sh examples/train_sft.sh ![image](https://user-images.githubusercontent.com/22451062/233003591-6f777795-54cf-4f57-8f1b-184a7bdfde7d.png) - The error message is as follows: [04/19/23 15:25:30] INFO colossalai - colossalai - INFO: /home/jovyan/work/projects/Example/ColossalAI/venv/lib/python3.8/site-packages/colossalai/context/parallel_context.py:522 set_device INFO colossalai - colossalai - INFO: process rank 0...

bug

### 🐛 Describe the bug I was trying to run: torchrun --standalone --nproc_per_node=2 train_dummy.py --strategy colossalai_zero2 under applications/Chat/examples, and got this error. I tried possible solutions mentioned in other previous...

bug

### 🐛 Describe the bug Traceback (most recent call last): File "train_sft.py", line 175, in <module> train(args) File "train_sft.py", line 146, in train train(args) File "train_sft.py", line 146, in train trainer.fit(logger=logger,...

bug

### 🐛 Describe the bug After the Llama model is trained using the LoRA training method, the model can be saved normally. However, the LoRA model parameters were not included in the...

bug

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A...

Run Build and Test
API

### 🐛 Describe the bug Hi colossalai, I am trying to use colossalai to fine-tune stable diffusion. In the code, optimizer is defined as GeminiAdamOptimizer. I used the following code...

bug

### 🐛 Describe the bug CUDA_VISIBLE_DEVICES=6 python train.py Traceback (most recent call last): File "train.py", line 13, in <module> from colossalai.utils.model.colo_init_context import ColoInitContext ModuleNotFoundError: No module named 'colossalai.utils.model.colo_init_context' ### Environment absl-py...

bug

Gemini plugin: support sharded checkpoints to avoid large checkpoint files.

### 🐛 Describe the bug **Describe the bug** docker.io/hpcaitech/colossalai:0.2.x (x > 0) report Colossal AI version 0.2.0 and contain non-release tagged code from >0.2.0 and

bug

### Discussed in https://github.com/hpcaitech/ColossalAI/discussions/3606 Originally posted by **cryoco** April 19, 2023 I've seen 2 Ray implementations of PPO in this repo, [#3195](https://github.com/hpcaitech/ColossalAI/pull/3195) and [#3309](https://github.com/hpcaitech/ColossalAI/pull/3309). The former makes the...