ColossalAI
[PROPOSAL]: Improve Automation for Better Development Experience
Proposal
Overview
We want to support all developers so that they can fully focus on their features and bug fixes. Over the development life cycle, a lot of tedious work, such as testing, has to be done. We already support many automated workflows with GitHub Actions, but there is still room for improvement.
Developer's View
From the developer's perspective, the most important thing is that everything other than writing code can be automated. Currently, most jobs have been automated except for Test Cache, which caches tests based on code coverage to improve test efficiency.
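To illustrate the idea behind Test Cache, here is a minimal sketch of coverage-based test selection: only tests whose recorded coverage touches a changed file are rerun. The function name `select_tests`, the shape of `coverage_map`, and the file paths are hypothetical and chosen for illustration; a tool such as testmon tracks this kind of mapping automatically, and its real behaviour may differ.

```python
from typing import Dict, Iterable, List, Set


def select_tests(coverage_map: Dict[str, Set[str]], changed_files: Iterable[str]) -> List[str]:
    """Return the tests whose recorded coverage touches any changed file."""
    changed = set(changed_files)
    # A test must be rerun if it executed at least one file that has changed.
    return sorted(test for test, files in coverage_map.items() if files & changed)


# Hypothetical coverage data: test id -> source files it executed on the last run.
coverage_map = {
    "tests/test_a.py::test_x": {"pkg/a.py", "pkg/util.py"},
    "tests/test_b.py::test_y": {"pkg/b.py"},
}

# Only test_x depends on pkg/util.py, so only test_x is selected.
print(select_tests(coverage_map, ["pkg/util.py"]))
```

Tests that do not appear in the result can reuse their cached outcome, which is where the efficiency gain would come from.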
Organizer's view
The organizers must have a clear view of how the project is going. An organizer does not necessarily write code for the project, but there must be something to report to them so that they can see the project is in a healthy state.
The diagrams above show the different checks that are conducted on a regular basis. We want to hook these checks up to Lark so that everyone, including the organizers, knows how the library is performing.
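As a rough sketch of what hooking a check up to Lark could look like, the snippet below builds a plain-text report and posts it to a custom bot webhook. It assumes the webhook accepts a JSON body with `msg_type` and `content.text` fields; the helper names, the report layout, and the webhook URL are hypothetical and are not taken from the actual workflows.

```python
import json
import urllib.request


def build_lark_text_payload(title: str, results: dict) -> dict:
    """Format check results as a plain-text message body (assumed payload shape)."""
    lines = [title] + [f"- {name}: {status}" for name, status in results.items()]
    return {"msg_type": "text", "content": {"text": "\n".join(lines)}}


def send_to_lark(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


payload = build_lark_text_payload("Scheduled checks", {"build": "passed", "test": "failed"})
print(payload["content"]["text"])
# To actually send it, pass a webhook URL stored as a CI secret (hypothetical name):
# send_to_lark(os.environ["LARK_WEBHOOK_URL"], payload)
```

In a GitHub Actions workflow, a script like this would typically run as a final step that executes regardless of whether earlier steps succeeded, so that failures are reported too.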
Future Plan
In this section, the tasks that should be conducted are listed and tracked below to give everyone a clear picture of the progress.
ID | Task | Issue Number |
---|---|---|
1 | Adjust the GPU memory threshold for scheduled build and test | #2557 |
2 | Add fail-fast: false to matrix strategies | #2553, #2559, #2541 |
3 | Make workflow files independent by event trigger | #2553, #2559, #2541 |
4 | Add community report and send to Lark | #2563, #2566, #2572 |
5 | Detect all GPUs memory availability for scheduled build | #2580 |
6 | Hook scheduled build and test with Lark | #2575 |
7 | Hook scheduled compatibility tests with Lark | #2587 |
8 | Hook scheduled example test with Lark | #2583, #2588 |
9 | Add multiprocessing-supported testmon for test cache | - |
10 | Test with test-pypi before release | #2592 |
11 | Hook PyPI release with Lark | #2595 |
12 | Hook Docker build with Lark | #2593 |
13 | Change bdist build to cuda ext check | #2597 |
14 | Fix test coverage report | #2610 |
You can view the project progress at https://github.com/orgs/hpcaitech/projects/8
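Tasks 1 and 5 both concern checking GPU memory before a scheduled build starts. The sketch below shows one possible way to do that by parsing the output of `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`. The function names and the threshold value are hypothetical; the actual workflows may implement the check differently.

```python
import subprocess
from typing import List


def parse_used_memory(csv_output: str) -> List[int]:
    """Parse one used-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in csv_output.splitlines() if line.strip()]


def all_gpus_available(used_mib: List[int], threshold_mib: int) -> bool:
    """True only if at least one GPU was found and every GPU is at or under the threshold."""
    return bool(used_mib) and all(used <= threshold_mib for used in used_mib)


def query_used_memory() -> List[int]:
    """Ask nvidia-smi for the used memory of every GPU (requires an NVIDIA driver)."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_used_memory(output)


# Sample output for a two-GPU machine; the 500 MiB threshold is an arbitrary example.
sample = "12\n2048\n"
print(all_gpus_available(parse_used_memory(sample), threshold_mib=500))
```

Checking every GPU rather than only the first one avoids starting a multi-GPU test on a machine where another job already occupies one of the devices.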
Self-service
- [X] I'd be willing to do some initial work on this proposal myself.
Another automation task that is required is doc test.
This should be part of the user experience as stated in #2579.
Thanks for the awesome work!