DarrenYing

Results 3 issues of DarrenYing

## Summary I am wondering whether OneFlow supports this kind of operation. For example, I have an input tensor of shape [1, 3, 200, 200] ([batch_size, channel, width, height])...

bug
community

### 📚 The doc issue We provide a [runnable example](https://github.com/hpcaitech/ColossalAI-Examples/tree/main/features/gradient_handler) to demonstrate the use of the gradient handler. In this example, we used DataParallelGradientHandler instead of PyTorch DistributedDataParallel for data parallel...

### Description With 4 nodes and 4 GPUs on each node, I ran a set of comparison experiments. Configuration 1: dp within each node, pp (pipeline) across nodes ``` P_STAGE1 = flow.placement("cuda", ranks=[0, 1, 2, 3]) P_STAGE2 = flow.placement("cuda", ranks=[4, 5, 6, 7]) P_STAGE3 = flow.placement("cuda", ranks=[8, 9, 10, 11]) P_STAGE4 = flow.placement("cuda", ranks=[12,...