ColossalAI
Making large AI models cheaper, faster and more accessible
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A...
### 🐛 Describe the bug The script I ran is `gpt/gemini/run_gemini.sh`, on 2 GPUs. The rest of the code was unchanged. The model I used is gpt...
Updated GitHub documentation: 1. The version pinned in requirements.txt is incompatible and cannot be downloaded. It is recommended to change `opencv-python==4.6.0` to `opencv-python==4.6.0.66` (otherwise an error will be...
### 🐛 Describe the bug I hit an error while setting up the environment from requirements.txt. The error says it cannot find a compatible...
## 📌 Checklist before creating the PR - [x] I have created an issue for this PR for traceability - [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...
### 🐛 Describe the bug Hello, I am training an OPT model on A100 GPUs. I found that it used 76 GB of GPU memory when I used `auto` mode and set `gpu_margin_mem_ratio`...
### 🐛 Describe the bug I used the examples to run gpt2 in pipeline mode. The command is in `examples/language/gpt/experiments/pipeline_parallel/run.sh`. The error is: Traceback (most recent call last): File "/home/guest_01/ColossalAI/colossalai/fx/tracer/tracer.py", line...
### 🐛 Describe the bug The torchrec model in the test kit's model zoo causes test errors in `test_booster` and `test_fx`. ### Environment _No response_
### Discussed in https://github.com/hpcaitech/ColossalAI/discussions/3156 Originally posted by **bobo0810** March 17, 2023 For basic operators such as conv and linear, could the official list clearly state the scope in which tensor parallelism takes effect?