ColossalAI
[lazyinit] add verification for distributed cases
Overview
This work should start after #3148. Once that is done, we will be able to create a model with lazy initialization and sharding. We then have to verify correctness for distributed training cases (tensor parallel or ZeRO-3).
Want to track the development progress? Take a look at:
proposal: https://github.com/hpcaitech/ColossalAI/discussions/3124
kanban: https://github.com/orgs/hpcaitech/projects/20
Goal
Verify the correctness of lazy init for distributed cases.
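
One way to picture such a check is the toy sketch below. It is not ColossalAI code: `eager_init`, `LazyParam`, and `verify_sharded_lazy_init` are hypothetical names, plain Python lists stand in for tensors, and row-wise sharding is an assumption made only for illustration. The idea is that each rank materializes only its own shard from a lazily recorded parameter, and the gathered shards must equal the eagerly initialized reference.

```python
import random


def eager_init(shape, seed):
    """Reference path: materialize the full parameter immediately."""
    rows, cols = shape
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


class LazyParam:
    """Hypothetical stand-in for a lazy tensor: stores the recipe, not the data."""

    def __init__(self, shape, seed):
        self.shape = shape
        self.seed = seed

    def materialize_shard(self, rank, world_size):
        """Build only the row-wise shard owned by `rank`."""
        rows, cols = self.shape
        assert rows % world_size == 0, "rows must divide evenly across ranks"
        per_rank = rows // world_size
        rng = random.Random(self.seed)
        # Replay and discard the draws that belong to earlier ranks so this
        # shard sees the same random stream as the eager reference.
        for _ in range(rank * per_rank * cols):
            rng.gauss(0.0, 0.02)
        return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(per_rank)]


def verify_sharded_lazy_init(shape, seed, world_size):
    """Return True if the gathered lazy shards equal the eager parameter."""
    reference = eager_init(shape, seed)
    lazy = LazyParam(shape, seed)
    gathered = []
    for rank in range(world_size):
        gathered.extend(lazy.materialize_shard(rank, world_size))
    return gathered == reference


if __name__ == "__main__":
    print(verify_sharded_lazy_init((8, 4), seed=0, world_size=4))
```

A real test would presumably run one process per rank and compare tensors with a numerical tolerance; the single-process loop here only illustrates the comparison itself.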