[FEATURE]: Can Colossal-AI support a customized parallel schedule? Is there a tutorial for that?
Describe the feature
As a researcher developing auto parallelism, it would be great if we could use Colossal-AI to test the parallel schedules we design.
Currently, Colossal-AI supports 1D, 2D, 2.5D, and 3D tensor parallelism. But can Colossal-AI support a customized parallel schedule?
For example, suppose I want to train ResNet-18 on the MNIST dataset:
- For pipeline parallelism, I partition ResNet-18 into 4 submodels: Model:0 to Model:3.
- For data parallelism, I want to simultaneously train 3 batches of data: data:0 to data:2.
- For hardware, I have 4 GPUs: cuda:0 to cuda:3.
If I want to estimate the performance of the following parallel schedule, what should be done? How should I modify the configuration file or the source code to estimate this parallel schedule?

I think you could try configuring your training with pipeline parallel size 4 and microbatch size 3, and enlarging your training batch size by a factor of three. Then run your training in 'forward only' mode. In my view, Colossal-AI will schedule your training as shown in the picture above.
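To get a feel for what such a schedule looks like before touching any configuration, here is a minimal sketch in plain Python. It does not call Colossal-AI at all. The function names `forward_only_schedule` and `utilization` are hypothetical, and it assumes every stage takes one time unit per batch and that communication is free, which a real run will not satisfy.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]


def forward_only_schedule(num_stages: int, num_batches: int) -> Grid:
    """Return grid[stage][step] = index of the batch processed, or None if idle.

    Assumption: each stage needs exactly one time step per batch, and
    batch b can enter stage s only after it has left stage s - 1.
    """
    num_steps = num_stages + num_batches - 1
    grid: Grid = [[None] * num_steps for _ in range(num_stages)]
    for stage in range(num_stages):
        for batch in range(num_batches):
            # Batch `batch` reaches stage `stage` at step stage + batch.
            grid[stage][stage + batch] = batch
    return grid


def utilization(grid: Grid) -> float:
    """Fraction of (stage, step) slots in which a device is busy."""
    busy = sum(cell is not None for row in grid for cell in row)
    total = len(grid) * len(grid[0])
    return busy / total


# 4 sub-models (Model:0 to Model:3, one per GPU) and 3 batches (data:0 to data:2).
grid = forward_only_schedule(num_stages=4, num_batches=3)
for stage, row in enumerate(grid):
    cells = " ".join(f"d{b}" if b is not None else ".." for b in row)
    print(f"cuda:{stage} | {cells}")
print(f"steps: {len(grid[0])}, utilization: {utilization(grid):.2f}")
```

Under these idealized assumptions the forward pass over 3 batches finishes in 4 + 3 - 1 = 6 steps, and each GPU is idle half of the time. Measured numbers from an actual forward-only run would replace the unit step cost used here.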
We have updated a lot. This issue was closed due to inactivity. Thanks.