ColossalAI
Making large AI models cheaper, faster and more accessible
## 📌 Checklist before creating the PR

- [ ] I have created an issue for this PR for traceability
- [x] The title follows the standard format: `[doc/gemini/tensor/...]: A...
### 🐛 Describe the bug

**I use the Gemini plugin with tp>1, and an error occurs when saving the optimizer.**

Traceback (most recent call last):
File "/mnt/lustre/tangyang2/wangzihan/ColossalAI/applications/Colossal-LLaMA-2/pretrain_np.py", line 480, in main() File...
### 🐛 Describe the bug

I am new to parallel distributed training. I was following the basic tutorials at https://colossalai.org/docs/basics/launch_colossalai. I am currently trying to check...
### 🐛 Describe the bug

What is the difference between train.sh and train_sft.sh? Is it the use_neft parameter? The SFT loss should focus on the answer (QA).

### Environment

_No response_
## 📌 Checklist before creating the PR

- [ ] I have created an issue for this PR for traceability
- [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
### 🐛 Describe the bug

I installed colossalai-0.3.6 from source. When I ran the example **_colossalai run --nproc_per_node 4 auto_parallel_with_resnet.py_**, I got the error in the title. How...
## 📌 Checklist before creating the PR

- [x] I have created an issue for this PR for traceability
- [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...