ColossalAI
[Inference] Optimize several scattered points in the framework
📌 Checklist before creating the PR
- [ ] I have created an issue for this PR for traceability
- [x] The title follows the standard format:
[doc/gemini/tensor/...]: A concise description
- [ ] I have added relevant tags if possible for us to better distinguish different PRs
🚨 Issue number
Link this PR to your issue with words like fixed to automatically close the linked issue upon merge
e.g. fixed #1234, closed #1234, resolved #1234
📝 What does this PR do?
Summarize your work here. If you have any plots/diagrams/screenshots/tables, please attach them here.
pytest:
model benchmark:
| bsz | in_len | out_len | Throughput (tokens/sec), before -> after |
|---|---|---|---|
| 16 | 128 | 128 | 1823.16 -> 1831.51 |
| 32 | 128 | 128 | 3144.30 -> 3164.13 |
| 64 | 128 | 128 | 5024.28 -> 5130.96 |
| 16 | 128 | 256 | 1791.81 -> 1844.73 |
| 32 | 128 | 256 | 3134.06 -> 3153.95 |
| 64 | 128 | 256 | 5056.01 -> 5102.04 |
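For reviewers who want the relative gain rather than raw throughput, here is a minimal sketch that derives the percentage change from the figures in the table above. The helper name `relative_gain_pct` and the `RESULTS` list are illustrative only and are not part of ColossalAI; the numbers are copied verbatim from the table.

```python
# Illustrative helper (not a ColossalAI API): relative throughput change in percent.
def relative_gain_pct(before: float, after: float) -> float:
    return (after - before) / before * 100.0


# (bsz, in_len, out_len, throughput_before, throughput_after), copied from the table.
RESULTS = [
    (16, 128, 128, 1823.16, 1831.51),
    (32, 128, 128, 3144.30, 3164.13),
    (64, 128, 128, 5024.28, 5130.96),
    (16, 128, 256, 1791.81, 1844.73),
    (32, 128, 256, 3134.06, 3153.95),
    (64, 128, 256, 5056.01, 5102.04),
]

# One gain value per table row, in the same order as RESULTS.
GAINS = [relative_gain_pct(before, after) for *_, before, after in RESULTS]

for (bsz, in_len, out_len, before, after), gain in zip(RESULTS, GAINS):
    print(f"bsz={bsz:>2} in={in_len} out={out_len}: "
          f"{before:.2f} -> {after:.2f} tokens/sec ({gain:+.2f}%)")
```

Every configuration shows a positive change. Since the table does not say how many runs each figure was taken from, the smaller gains may be within run-to-run noise.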
💥 Checklist before requesting a review
- [ ] I have linked my PR to an issue (instruction)
- [x] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
- [x] I have performed a self-review of my code
- [x] I have added thorough tests.
- [x] I have added docstrings for all the functions/methods I implemented
⭐️ Do you enjoy contributing to Colossal-AI?
- [x] 🌝 Yes, I do.
- [ ] 🌚 No, I don't.
Tell us more if you don't enjoy contributing to Colossal-AI.