Wang Binluo
## 📌 Checklist before creating the PR

- [ ] I have created an issue for this PR for traceability
- [ ] The title follows the standard format: `[doc/gemini/tensor/...]:...
## 🚨 Issue number

- [ ] https://github.com/hpcaitech/ColossalAI/issues/5573

## 📝 What does this PR do?

[shardformer/modeling/qwen2]: add qwen2.py and qwen2 policy to support qwen2 model, have passed all the tests...
### Describe the feature

We are excited to announce the addition of support for the qwen2 model in the ColossalAI framework. The qwen2 model is compatible with version 4.39.3 of...
### Describe the feature

Shardformer was originally developed against transformers==4.33.0. In response to user requests, it needs to be upgraded to version 4.36.0. The main changes involve the...
Support the parallel output function for shardformer models.
## 📝 What does this PR do?

Adds a supplementary comparison of the principles behind sequence parallelism approaches, including ring attention and Ulysses, and an explanation of their use cases.