
[PROPOSAL]: Improve Automation for Better Development Experience

Open FrankLeeeee opened this issue 2 years ago • 2 comments

Proposal

Overview

We want to support all developers so that they can focus fully on their features and bug fixes. Throughout the development life cycle, a lot of tedious work, such as testing, has to be done. We already run many automated workflows with GitHub Actions, but there is still room for improvement.

Developer's View

[Diagram: dx]

From the developer's perspective, the most important thing is that everything other than writing code can be automated. Currently, many jobs have been automated, with the exception of Test Cache, which caches test results based on code coverage to improve test efficiency.
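To make the Test Cache idea concrete, here is a minimal sketch of coverage-based test selection. It is an illustration only, not how testmon or the ColossalAI workflows actually work: the function name `select_tests`, the `coverage_map` structure, and all file paths are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def select_tests(coverage_map: Dict[str, Set[str]], changed_files: Iterable[str]) -> List[str]:
    """Return the tests whose covered files overlap with the changed files."""
    changed = set(changed_files)
    selected = []
    for test_name, covered_files in coverage_map.items():
        # A test must re-run if any file it executed in the last run has changed.
        if covered_files & changed:
            selected.append(test_name)
    return sorted(selected)


# Hypothetical coverage data recorded during a previous full test run.
coverage_map = {
    "tests/test_alpha.py::test_forward": {"pkg/alpha.py", "pkg/utils.py"},
    "tests/test_beta.py::test_backward": {"pkg/beta.py"},
    "tests/test_gamma.py::test_io": {"pkg/gamma.py", "pkg/utils.py"},
}

# Only the tests that touched pkg/utils.py are selected; the rest stay cached.
tests_to_run = select_tests(coverage_map, ["pkg/utils.py"])
print(tests_to_run)
```

A real setup would also need to run tests that have no recorded coverage yet (for example, newly added tests), which this sketch does not handle.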

Organizer's View

[Diagram: org]

Organizers need a clear view of how the project is going. An organizer does not necessarily write code for the project, but there must be something to report to them so that they know the project is in a healthy state.

The diagrams above show the different checks that are conducted on a regular basis. We want to hook these checks up to Lark so that everyone, including the organizers, knows how the library is performing.
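As a rough sketch of what hooking a check up to Lark could look like, the snippet below builds a text message and posts it to a webhook. It assumes a Lark custom-bot webhook that accepts a JSON body of the form `{"msg_type": "text", "content": {"text": ...}}`. The function names, the message layout, and the example values are hypothetical and are not taken from the ColossalAI workflows.

```python
import json
import urllib.request


def build_lark_payload(workflow: str, status: str, run_url: str) -> dict:
    """Build a plain-text message body (assumed Lark custom-bot format)."""
    text = f"[{status.upper()}] {workflow}\n{run_url}"
    return {"msg_type": "text", "content": {"text": text}}


def send_to_lark(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical values; in CI these would come from the workflow context.
payload = build_lark_payload(
    workflow="Scheduled build and test",
    status="failure",
    run_url="https://example.com/runs/123",
)
print(json.dumps(payload, indent=2))
```

In a workflow, `send_to_lark` would be called with a webhook URL stored as a repository secret, so that the URL never appears in the logs.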

Future Plan

The tasks to be completed are listed and tracked below to give everyone a clear picture of the progress.

| ID | Task | Issue Number |
|----|------|--------------|
| 1 | Adjust the GPU memory threshold for scheduled build and test | #2557 |
| 2 | Add fail-fast: false to matrix strategies | #2553, #2559, #2541 |
| 3 | Make workflow files independent by event trigger | #2553, #2559, #2541 |
| 4 | Add community report and send to Lark | #2563, #2566, #2572 |
| 5 | Detect all GPUs memory availability for scheduled build | #2580 |
| 6 | Hook scheduled build and test with Lark | #2575 |
| 7 | Hook scheduled compatibility tests with Lark | #2587 |
| 8 | Hook scheduled example test with Lark | #2583, #2588 |
| 9 | Add multiprocessing-supported testmon for test cache | - |
| 10 | Test with test-pypi before release | #2592 |
| 11 | Hook PyPI release with Lark | #2595 |
| 12 | Hook Docker build with Lark | #2593 |
| 13 | Change bdist build to CUDA ext check | #2597 |
| 14 | Fix test coverage report | #2610 |

You can view the project progress at https://github.com/orgs/hpcaitech/projects/8
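Tasks 1 and 5 both concern checking GPU memory before a scheduled build starts. The following is a minimal sketch of such a check, assuming `nvidia-smi` is available on the runner. The helper names and the threshold value are hypothetical; the proposal does not state the actual threshold.

```python
import subprocess
from typing import List


def parse_used_memory(nvidia_smi_output: str) -> List[int]:
    """Parse one used-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def all_gpus_available(used_mib: List[int], threshold_mib: int) -> bool:
    """True only if at least one GPU was found and every GPU is at or below the threshold."""
    return bool(used_mib) and all(used <= threshold_mib for used in used_mib)


def query_used_memory() -> List[int]:
    """Ask nvidia-smi for the used memory of every GPU on this machine."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_used_memory(output)


# Hypothetical threshold in MiB; the real value would be tuned per runner.
THRESHOLD_MIB = 2000

# Sample output for a machine with four GPUs, used here instead of a live query.
sample_output = "12\n350\n4096\n0\n"
used = parse_used_memory(sample_output)
print(used, all_gpus_available(used, THRESHOLD_MIB))
```

On a GPU runner, `all_gpus_available(query_used_memory(), THRESHOLD_MIB)` could gate the build step, so that the job waits or skips when another process is occupying a device.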

Self-service

  • [X] I'd be willing to do some initial work on this proposal myself.

FrankLeeeee, Feb 05 '23 12:02

Another automation task that is required is doc test.

FrankLeeeee, Feb 05 '23 12:02

Another automation task that is required is doc test.

This should be part of the user experience, as stated in #2579.

FrankLeeeee, Feb 06 '23 02:02

Thanks for the awesome work!

binmakeswell, Apr 18 '23 08:04