Jiarui Fang(方佳瑞)

Results 63 issues of Jiarui Fang(方佳瑞)

### What's the PR for Previously, Gemini was only checked on GPT2. After adding more types of test models, the unit test fails. You can merge this PR. But @1SAA...

Run Build and Test

When bert grad_checkpoint = True, the number of sampling points detected by the runtime tracer does not match the number detected by Gemini's tracer.

### Why Make sure the user can run OPT to profile performance in 1 minute. No data download, no complex training parameter settings. Just run a few iterations.

I converted this to a draft because GeminiDDP cannot run inference. The states will be wrong.

## What's new This PR removes the dependency of LowLevelZeroOptimizer on gpc. gpc is a global variable. If we use it, we cannot use LowLevelZeroOptimizer together with ColoTensor TP. This...

Run Build and Test

### Describe the feature ## Current States Currently, GeminiDDP has to shard the optimizer state, and it covers zero3 and zero3+offload. However, it doesn't actually cover zero1 and zero3....

enhancement