ColossalAI
[feat] cuda graph support and refactor non-functional api
📌 Checklist before creating the PR
- [x] I have created an issue for this PR for traceability
- [x] The title follows the standard format:
[doc/gemini/tensor/...]: A concise description
- [ ] I have added relevant tags if possible for us to better distinguish different PRs
📝 What does this PR do?
Summarize your work here. if you have any plots/diagrams/screenshots/tables, please attach them here.
- Add CUDA Graph support
- Refactor some non-functional APIs so that CUDA Graph can capture the computation graph
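
For reviewers less familiar with the capture/replay pattern, here is a minimal sketch using PyTorch's public `torch.cuda.CUDAGraph` API. It is not the code in this PR: the `Linear` model, the tensor shapes, and the `capture_and_replay` helper are illustrative assumptions. The idea is that the graph records kernels against fixed ("static") buffers, so new inputs are copied into those buffers before each replay.

```python
try:
    import torch
except ImportError:  # keep the sketch importable on machines without PyTorch
    torch = None


def capture_and_replay():
    """Capture one forward pass in a CUDA graph and replay it on new data.

    Returns None when PyTorch or a CUDA device is unavailable, otherwise
    True if the replayed output matches an eager forward pass.
    """
    if torch is None or not torch.cuda.is_available():
        return None

    # Hypothetical stand-in for the real inference model.
    model = torch.nn.Linear(16, 16).cuda().eval()
    # Static buffer: the graph always reads its input from this tensor.
    static_input = torch.zeros(4, 16, device="cuda")

    # Warm up on a side stream so lazy initialization happens before capture.
    side_stream = torch.cuda.Stream()
    side_stream.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side_stream), torch.no_grad():
        for _ in range(3):
            model(static_input)
    torch.cuda.current_stream().wait_stream(side_stream)

    # Capture: kernels launched in this block are recorded, not just run.
    graph = torch.cuda.CUDAGraph()
    with torch.no_grad(), torch.cuda.graph(graph):
        static_output = model(static_input)

    # Replay: copy fresh data into the static buffer, then relaunch the graph.
    new_input = torch.randn(4, 16, device="cuda")
    static_input.copy_(new_input)
    graph.replay()

    with torch.no_grad():
        expected = model(new_input)
    return torch.allclose(static_output, expected, atol=1e-5)


result = capture_and_replay()
```

Because replay reuses the recorded launches and memory addresses, code in the captured region generally needs to avoid host-side synchronization and fresh output allocations with data-dependent shapes, which is the usual motivation for this kind of API refactor.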
💥 Checklist before requesting a review
- [ ] I have linked my PR to an issue (instruction)
- [x] My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
- [x] I have performed a self-review of my code
- [ ] I have added thorough tests.
- [x] I have added docstrings for all the functions/methods I implemented
⭐️ Do you enjoy contributing to Colossal-AI?
- [x] 🌝 Yes, I do.
- [ ] 🌚 No, I don't.
Tell us more if you don't enjoy contributing to Colossal-AI.
Please don't merge yet, there are still a few bugs to fix. But feel free to review, because I've changed some APIs :)
All bugs are fixed :)
Fixed the dynamic-grid bugs in flash decoding. It now passes all the tests and can be merged :).
Unit tests:
All unit tests:
- After fixing conflicts with other PRs, all unit tests pass: