ColossalAI
Making large AI models cheaper, faster and more accessible
### 🐛 Describe the bug

root@autodl-container-8450119b52-890be3f8:~# colossalai run --nproc_per_node 1 train.py --use_trainer

/bin/bash: line 0: export: `=/usr/bin/supervisord': not a valid identifier

Error: failed to run torchrun --nproc_per_node=1 --nnodes=1 --node_rank=0 --rdzv_backend=c10d...
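One plausible reading of the bash message is that an environment entry with an empty name (and the value `/usr/bin/supervisord`) was turned into an `export` statement. The following is a minimal sketch under that assumption; `is_valid_identifier` and `build_exports` are hypothetical helpers for illustration and are not part of the ColossalAI launcher.

```python
import re
import shlex

# Shell variable names: a letter or underscore, then letters, digits, underscores.
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if `name` can be used as a shell variable name."""
    return _VALID_NAME.fullmatch(name) is not None


def build_exports(env: dict) -> list:
    """Build `export NAME=value` lines, skipping names bash would reject."""
    return [
        f"export {name}={shlex.quote(value)}"
        for name, value in env.items()
        if is_valid_identifier(name)
    ]


# Hypothetical environment: the empty key mimics the entry in the error message.
sample_env = {"": "/usr/bin/supervisord", "PATH": "/usr/bin"}
exports = build_exports(sample_env)
print(exports)
```

With this filter the empty-name entry is dropped before any `export` line is generated, so bash never sees `export =/usr/bin/supervisord`.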
In `ColossalAI/applications/Chat/coati/trainer/ppo.py`: `replay_buffer = NaiveReplayBuffer(train_batch_size, buffer_limit, buffer_cpu_offload)`. Because this is constructing experience data, should the `train_batch_size` in the above code be `experience_batch_size`?
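The question turns on whether the buffer's first argument sets the size of batches written into it or the size of batches drawn from it for training. The sketch below illustrates the second reading with a hypothetical `TinyReplayBuffer`; it is not coati's `NaiveReplayBuffer`, and the assumption that the first argument is a sampling size is not confirmed by the snippet above.

```python
import random


class TinyReplayBuffer:
    """Hypothetical buffer: stores experiences, samples fixed-size training batches."""

    def __init__(self, sample_batch_size, limit=0):
        self.sample_batch_size = sample_batch_size  # size of batches drawn for training
        self.limit = limit                          # 0 means unbounded
        self.items = []

    def append(self, experiences):
        # Experiences may arrive in batches of any size.
        self.items.extend(experiences)
        if self.limit > 0 and len(self.items) > self.limit:
            self.items = self.items[-self.limit:]   # keep the newest entries

    def sample(self, rng):
        # Draw a training batch without replacement.
        return rng.sample(self.items, self.sample_batch_size)


experience_batch_size = 8   # how many experiences one collection step produces
train_batch_size = 4        # how many experiences one training step consumes

buffer = TinyReplayBuffer(train_batch_size)
buffer.append(list(range(experience_batch_size)))
batch = buffer.sample(random.Random(0))
print(len(buffer.items), len(batch))
```

Under this reading the two sizes are independent: `experience_batch_size` governs what goes in, while the constructor argument governs what comes out, which would make passing `train_batch_size` reasonable.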
### Describe the feature

In https://github.com/hpcaitech/GPT-Demo?tab=readme-ov-file, the section "How to Prepare Webtext Dataset" says: "You can download the preprocessed sample dataset for this demo via our [Google Drive sharing link](https://drive.google.com/file/d/1QKI6k-e2gJ7XgS8yIpgPPiMmwiBP_BPE/view?usp=sharing)." We can see...
## 📌 Checklist before creating the PR

- [x] I have created an issue for this PR for traceability
- [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...
Tracking for implementation of Triton kernels compatible with relevant submodules and KVCache for inference.

- Context-stage Attention https://github.com/hpcaitech/ColossalAI/pull/5192
- Decoding-stage Attention
- Pos Embedding
  - https://github.com/hpcaitech/ColossalAI/pull/5181
- KVCache Copy
## 📌 Checklist before creating the PR

- [ ] I have created an issue for this PR for traceability
- [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
Development branch: https://github.com/hpcaitech/ColossalAI/tree/feat/speculative-decoding