Sungha Choi

5 comments by Sungha Choi

@M3Dade Hi, have you resolved the "wrong checksum" issue? Best,

I encountered the same issue. Is there any solution?

> ... /deepspeed/runtime/bf16_optimizer.py", line 312, in step
> [rank0]: assert all_groups_norm > 0.
> [rank0]: AssertionError

deepspeed 0.15.0
transformers 4.44.2

> I encountered the same issue. Is there any solution?
>
> > ... /deepspeed/runtime/bf16_optimizer.py", line 312, in step
> > [rank0]: assert all_groups_norm > 0.
> > [rank0]: AssertionError...

Hi @vacancy, thanks a lot for your reply :) I have tested sync batch norm on a deeplab-resnet based segmentation task. When I applied sync batch norm, it consumed about 30-40%...