ColossalAI
Making large AI models cheaper, faster and more accessible
### 🐛 Describe the bug Here is my script; it runs with the hybrid_parallel plugin, but every other plugin fails with the same "out of memory" error. torchrun --standalone --nproc_per_node 8 finetune.py...
### Describe the feature You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations of models and can...
### 🐛 Describe the bug RuntimeError: FlashAttention only supports Ampere GPUs or newer. ### Environment colossal 0.3.4 ColossalAI/examples/language/opt# bash run_demo.sh + pip install -r requirements.txt
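The RuntimeError above reflects FlashAttention's hardware requirement: it needs CUDA compute capability 8.0 (Ampere, sm_80) or newer, so older cards such as V100 (sm_70) or T4 (sm_75) cannot use it. A minimal pre-flight check can fail fast with a clearer message; the helper below is an illustrative sketch (not ColossalAI code) that takes the `(major, minor)` pair that `torch.cuda.get_device_capability()` reports.

```python
def supports_flash_attention(major: int, minor: int) -> bool:
    """Return True if a GPU's compute capability is Ampere (8.0) or newer.

    Pass the (major, minor) tuple reported by
    torch.cuda.get_device_capability(); FlashAttention refuses to run
    on anything below sm_80.
    """
    return (major, minor) >= (8, 0)


# V100 is sm_70 and T4 is sm_75, so both are unsupported; A100 is sm_80.
print(supports_flash_attention(7, 5))  # False
print(supports_flash_attention(8, 0))  # True
```

Calling this before enabling FlashAttention lets a script fall back to a standard attention implementation instead of crashing mid-run.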
### 🐛 Describe the bug I want to build the entire project from source, but it failed on `pip install .`; the error seems to be PyTorch-header related. I'm using...
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
[BUG]: How to run llama2 70B pretraining on 32 GPUs? I got an OOM error with almost every plugin and config.
### 🐛 Describe the bug I have tried gemini / gemini_auto / zero2 / hybrid_parallel and still got an OOM error. With the hybrid_parallel plugin, I tried the following configs: 1....
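For context on why a 70B model OOMs here across plugins: a rough back-of-envelope estimate (an assumption for illustration, not a measured figure) counts mixed-precision Adam state at about 16 bytes per parameter: bf16 weights and gradients plus fp32 master weights and two fp32 Adam moments. Even with ideal ZeRO-3-style sharding across 32 GPUs, parameter and optimizer state alone leave tens of GB per device before any activation memory.

```python
# Hedged estimate, not measured numbers.
# Mixed-precision Adam bytes per parameter is commonly counted as:
#   2 (bf16 param) + 2 (bf16 grad) + 4 + 4 + 4 (fp32 master + two moments) = 16
params = 70e9
bytes_per_param = 16
total_gb = params * bytes_per_param / 1e9   # 1120.0 GB of param/optimizer state
per_gpu_gb = total_gb / 32                  # 35.0 GB per GPU with ideal sharding
print(per_gpu_gb)
```

On 32 GB or 40 GB cards this alone exceeds or nearly fills device memory before activations are counted, which is consistent with OOM under every config; CPU/NVMe offloading or more GPUs is the usual way out.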
### Describe the feature To reduce memory usage, provide support for LoRA.
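To illustrate why LoRA reduces memory: instead of updating a frozen weight W of shape (out, in), it trains two small low-rank factors A (r, in) and B (out, r), so gradients and optimizer state cover only r·(in + out) parameters. The NumPy sketch below is illustrative only (the class and parameter names are hypothetical, not ColossalAI's API):

```python
import numpy as np

class LoRALinear:
    """Sketch of a LoRA-adapted linear layer: y = x W^T + (alpha/r) x A^T B^T.

    W is frozen; only the low-rank factors A and B would be trained.
    B starts at zero, so the adapter is a no-op at initialization.
    """
    def __init__(self, in_features, out_features, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((out_features, in_features))  # frozen base weight
        self.A = 0.01 * rng.standard_normal((r, in_features))      # trainable
        self.B = np.zeros((out_features, r))                       # trainable, zero-init
        self.scale = alpha / r

    def forward(self, x):
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T


layer = LoRALinear(in_features=16, out_features=8, r=2)
x = np.ones((1, 16))
# With B zero-initialized, the LoRA path contributes nothing yet:
assert np.allclose(layer.forward(x), x @ layer.W.T)
```

For this toy layer the base weight has 16 × 8 = 128 parameters, while the trainable LoRA factors total only 2 × (16 + 8) = 48; the gap grows dramatically at transformer scale.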
### Proposal DOC: https://n4fyd3ptax.feishu.cn/docx/MhlmdHsGkoeoslx9fqucPO17n9b ### Self-service - [ ] I'd be willing to do some initial work on this proposal myself.