ColossalAI
Making large AI models cheaper, faster and more accessible
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug loading sharded model does not raise an error...
…cessary copies of the data ## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title...
## 📌 Checklist before creating the PR - [ ] I have created an issue for this PR for traceability - [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug Posting this for documentation purposes. torch.compile has been...
### Describe the feature I built a custom network with some custom operators, e.g., flash-attention, but found that neither colotracer nor node_handler can handle them successfully. But I don't...
When benchmarking an LLM, tokens/s is also an important metric, so it is added to the llama benchmark.
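As a rough illustration of the metric itself (not the code from the llama benchmark), tokens/s is the number of tokens processed divided by the wall-clock time taken. The function name, batch shape, and timing below are hypothetical values chosen for the example.

```python
def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Throughput as tokens processed per wall-clock second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


# Hypothetical step: a batch of 8 sequences, 512 tokens each, taking 2.0 s.
batch_size, seq_len = 8, 512
throughput = tokens_per_second(batch_size * seq_len, elapsed_seconds=2.0)
print(f"{throughput:.1f} tokens/s")
```

In a multi-GPU run the token count would normally be summed over all data-parallel ranks before dividing; that aggregation is left out of this sketch.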
### Is there an existing issue for this bug? - [X] I have searched the existing issues ### 🐛 Describe the bug The line **808** of `zero/low_level/low_level_optim.py` assumes that every...