stylegan2-pytorch
Why do you use EMA to update the model rather than directly optimize the model?
Generally EMA improves training stability, especially in GANs.
Where does that conclusion come from?
It is common practice; you can refer to this paper: https://arxiv.org/abs/1806.04498
@rosinality Hi,
Could you please explain how the exact accumulation value was chosen? Thank you.
In the code, it's `accum = 0.5 ** (32 / (10 * 1000))`, which is about 0.9977843871238888.
@LiUzHiAn It means the old weights' contribution decays to 50% after the model has been trained on 10,000 images, i.e. 312.5 steps at batch size 32.
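A minimal sketch of that relationship, assuming the per-step EMA update `ema = decay * ema + (1 - decay) * new` (the helper names below are illustrative, not taken from the repo):

```python
import math

def ema_decay(batch_size: int, half_life_kimg: float = 10.0) -> float:
    # Decay chosen so the old weights' contribution halves after
    # half_life_kimg * 1000 training images at the given batch size.
    return 0.5 ** (batch_size / (half_life_kimg * 1000))

def ema_update(ema_params: dict, model_params: dict, decay: float) -> dict:
    # In-place exponential moving average of the parameters.
    for k in ema_params:
        ema_params[k] = decay * ema_params[k] + (1 - decay) * model_params[k]
    return ema_params

decay = ema_decay(batch_size=32)       # ~0.99778, the accum value above
steps = 10 * 1000 / 32                 # 312.5 steps = 10,000 images

# After `steps` updates, the initial weights' share is decay**steps,
# which is exactly 0.5 by construction: 0.5 ** (32 * 312.5 / 10000) = 0.5 ** 1.
print(decay, decay ** steps)
```

Since `decay ** steps = 0.5 ** (batch_size * steps / 10000)`, the half-life is expressed in images seen, not optimizer steps, so the same constant gives consistent averaging behavior across batch sizes.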
Oh, I understand. Thanks.