
Why use LayerNorm in the MLP?

Open pUmpKin-Co opened this issue 3 years ago • 2 comments

Why use LayerNorm in the MLP?

pUmpKin-Co avatar Dec 28 '21 07:12 pUmpKin-Co

I had some issues with the combination of SyncBatchNorm and EMA in distributed training, so I just replaced it with LayerNorm as a workaround.

Nothing stopping you from changing it back to batch norm though.
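For reference, a minimal sketch of the kind of head being discussed, assuming a BYOL-style projection MLP (the function name and layer sizes are illustrative, not the exact detcon-pytorch code):

```python
import torch.nn as nn

# Illustrative BYOL-style projection/prediction head. LayerNorm is the
# workaround described above; swapping the nn.LayerNorm line for
# nn.BatchNorm1d restores the usual BatchNorm variant.
def mlp(input_dim: int, hidden_dim: int = 4096, output_dim: int = 256) -> nn.Sequential:
    return nn.Sequential(
        nn.Linear(input_dim, hidden_dim),
        nn.LayerNorm(hidden_dim),  # workaround; originally BatchNorm
        nn.ReLU(inplace=True),
        nn.Linear(hidden_dim, output_dim),
    )
```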

isaaccorley avatar Dec 28 '21 14:12 isaaccorley

Thanks for your reply. In my small experiment, I found that LayerNorm was slightly worse than BatchNorm, so I asked about the reason for using LayerNorm.
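For anyone who wants to try switching back, a sketch under the same illustrative assumptions as above; `nn.SyncBatchNorm.convert_sync_batchnorm` is the standard PyTorch call for converting BatchNorm layers before distributed training:

```python
import torch.nn as nn

# Same illustrative head with BatchNorm instead of LayerNorm.
head = nn.Sequential(
    nn.Linear(2048, 4096),
    nn.BatchNorm1d(4096),  # BatchNorm variant of the head
    nn.ReLU(inplace=True),
    nn.Linear(4096, 256),
)

# For DistributedDataParallel, convert BatchNorm layers to SyncBatchNorm.
# Note: this is the combination with EMA that originally caused the issue.
head = nn.SyncBatchNorm.convert_sync_batchnorm(head)
```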

pUmpKin-Co avatar Dec 28 '21 15:12 pUmpKin-Co