
Making large AI models cheaper, faster and more accessible

Results: 1072 ColossalAI issues

### Describe the feature How do I enable activation checkpoint offloading? Can anyone help me with this?

enhancement

I trained Llama2-7B-chat on the Alpaca dataset, and when I set the batch size to 2 or 4, "INFO: Found overflow. Skip step." appeared at every step of the...

### 🐛 Describe the bug When I enable the optimization options inside the gemini_auto plugin, I encounter errors such as TypeError: GeminiPlugin.__init__() got an unexpected keyword argument 'enable_flash_attention'. ### Environment...
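The TypeError above is what Python raises when a constructor receives a keyword argument it does not declare, typically because the installed version predates the option. One defensive pattern (a generic sketch, not ColossalAI code; `DummyPlugin` and `filter_supported_kwargs` are hypothetical names standing in for any plugin class) is to filter the options against the constructor's signature before calling it:

```python
import inspect

def filter_supported_kwargs(cls, options):
    """Keep only the keyword arguments that cls.__init__ actually accepts."""
    params = inspect.signature(cls.__init__).parameters
    # If the constructor takes **kwargs, every option is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(options)
    return {k: v for k, v in options.items() if k in params}

# Hypothetical stand-in for a plugin whose constructor lacks
# the 'enable_flash_attention' option in the installed version.
class DummyPlugin:
    def __init__(self, placement_policy="auto"):
        self.placement_policy = placement_policy

opts = {"placement_policy": "auto", "enable_flash_attention": True}
plugin = DummyPlugin(**filter_supported_kwargs(DummyPlugin, opts))
print(plugin.placement_policy)  # the unsupported option was silently dropped
```

Silently dropping an option can hide a real misconfiguration, so logging the discarded keys would be a reasonable addition in practice.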

bug

### 🐛 Describe the bug The code has too many problems; I suggest re-reviewing and maintaining it. ### Environment _No response_

bug

### 🐛 Describe the bug The current implementation of WarmupScheduler does not include the functionality to load the after_scheduler part of the parameters. This omission leads to a scenario where...
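A minimal sketch of the omission this report describes: a warmup wrapper whose state_dict/load_state_dict must also serialize and restore the wrapped `after_scheduler`, or a resumed run restarts the inner scheduler from its defaults. The classes below are illustrative stand-ins, not the actual ColossalAI implementation.

```python
class ConstantScheduler:
    """Stand-in for the wrapped 'after' scheduler."""
    def __init__(self, lr=0.1):
        self.lr = lr
    def state_dict(self):
        return {"lr": self.lr}
    def load_state_dict(self, state):
        self.lr = state["lr"]

class WarmupScheduler:
    """Warms up for warmup_steps, then delegates to after_scheduler."""
    def __init__(self, after_scheduler, warmup_steps=100):
        self.after_scheduler = after_scheduler
        self.warmup_steps = warmup_steps
        self.current_step = 0

    def state_dict(self):
        # Serialize the wrapped scheduler's state, not just our own fields.
        return {
            "warmup_steps": self.warmup_steps,
            "current_step": self.current_step,
            "after_scheduler": self.after_scheduler.state_dict(),
        }

    def load_state_dict(self, state):
        self.warmup_steps = state["warmup_steps"]
        self.current_step = state["current_step"]
        # The step the issue says is missing: without this line the wrapped
        # scheduler resumes from its defaults instead of the saved state.
        self.after_scheduler.load_state_dict(state["after_scheduler"])

saved = WarmupScheduler(ConstantScheduler(lr=0.01), warmup_steps=50)
saved.current_step = 50
restored = WarmupScheduler(ConstantScheduler())
restored.load_state_dict(saved.state_dict())
print(restored.after_scheduler.lr)  # 0.01 — the wrapped state survives
```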

bug

### 🐛 Describe the bug On a machine with 8× A100 80G GPUs, with batch_size=1 and the 7B llama-2 model, neither train_sft.py nor train_reward_model.py will run. ### Environment You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations...

bug

## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...

### Describe the feature I appreciate your great work in releasing the [llama 2 model](https://github.com/hpcaitech/ColossalAI/tree/785802e809ccf26b3864ae811dc908ecdf601a70/applications/Colossal-LLaMA-2). When will the Data Processing Toolkit be released?

enhancement

### 🐛 Describe the bug I was trying to reproduce the benchmark results on https://github.com/hpcaitech/ColossalAI/blob/main/applications/Chat/README.md which says: > DeepSpeedChat performance comes from its blog on 2023 April 12, ColossalChat performance...

bug

## 📝 What does this PR do? Added support for batch_encoding in the to_device method, based on Issue #4489. Fixes #4489
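A rough sketch of what such a to_device extension could look like: recursively move tensors and containers of tensors (a tokenizer's BatchEncoding behaves like a dict of tensors and also exposes a `.to()` method) onto a target device. This is not the PR's actual code; `FakeTensor` is a hypothetical stand-in so the sketch runs without torch, and real code would dispatch on `torch.Tensor`.

```python
from collections.abc import Mapping, Sequence

class FakeTensor:
    """Hypothetical stand-in for torch.Tensor with a .to() method."""
    def __init__(self, data, device="cpu"):
        self.data, self.device = data, device
    def to(self, device):
        return FakeTensor(self.data, device)

def to_device(obj, device):
    """Recursively move tensors (and containers of tensors) to a device."""
    if hasattr(obj, "to"):          # tensors and BatchEncoding both expose .to
        return obj.to(device)
    if isinstance(obj, Mapping):    # plain dicts of tensors
        return type(obj)({k: to_device(v, device) for k, v in obj.items()})
    if isinstance(obj, Sequence) and not isinstance(obj, (str, bytes)):
        return type(obj)(to_device(v, device) for v in obj)
    return obj                      # non-tensor leaves pass through unchanged

batch = {"input_ids": FakeTensor([1, 2, 3]), "mask": [FakeTensor([1, 1, 1])]}
moved = to_device(batch, "cuda:0")
print(moved["input_ids"].device)  # cuda:0
```

Checking for a `.to` attribute first means a BatchEncoding is moved in one call rather than being unpacked as a mapping, which preserves its type and any non-tensor fields it carries.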