ColossalAI
Making large AI models cheaper, faster and more accessible
### 🐛 Describe the bug During booster initialization, some params have the wrong dtype, `torch.float32`, even though the precision is set to "bf16", and the optimizer initialization in booster...
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A...
### 🐛 Describe the bug 1. It seems blip2 testing doesn't work correctly at all if the model is in half precision (`torch.float16`). 2. With bfloat16, `colossalai.shardformer.layer.FusedLayerNorm` doesn't seem to work correctly....
### Describe the feature > Hugging Face hasn't officially supported the LLaMA models. In fact, HF has officially supported the LLaMA models.
### 🐛 Describe the bug The type of `error_msgs` is `str`, so we should not re-join it using `\n\t`. https://github.com/hpcaitech/ColossalAI/blob/d83c633ca63c4eef49f3473aa998515fa5ca573f/colossalai/checkpoint_io/general_checkpoint_io.py#L228 Otherwise it leads to weird output: ``` n\t\'\n\ts\n\tc\n\to\n\tr\n\te\n\t.\n\tw\n\te\n\ti\n\tg\n\th\n\tt\n\t\'\n\t]\n\t"\n\t.\n\t ``` ###...
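The behavior reported above can be reproduced with plain Python: `str.join` iterates over its argument, so passing an already-joined string inserts the separator between every character. The sketch below is a minimal illustration, not ColossalAI's code; `format_error_msgs` is a hypothetical helper name introduced only for this example.

```python
def format_error_msgs(error_msgs):
    """Hypothetical helper (not part of ColossalAI): build a printable error report."""
    if isinstance(error_msgs, str):
        # Already a single string: joining it again would split it into characters.
        return error_msgs
    # A list (or other iterable) of messages: join with newline + tab.
    return "\n\t".join(error_msgs)


already_joined = "ab"

# Re-joining a str puts the separator between each character.
broken = "\n\t".join(already_joined)

# Guarding on the type leaves the string untouched.
fixed = format_error_msgs(already_joined)
```

Checking the type before joining is one possible fix; simply dropping the second join at the call site would work equally well if `error_msgs` is always a `str` there.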
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
## 🚨 Issue number - [ ] https://github.com/hpcaitech/ColossalAI/issues/5573 ## 📝 What does this PR do? [shardformer/modeling/qwen2]: add qwen2.py and qwen2 policy to support qwen2 model, have passed all the tests...