stylegan2-pytorch
Why do you use EMA to update the model rather than directly optimize the model?
Generally EMA improves training stability, especially in GANs.
Where does that conclusion come from?
It is common practice; you can refer to this paper: https://arxiv.org/abs/1806.04498
@rosinality Hi,
Could you please explain how the exact accumulation value was chosen? Thank you.
In the code, it's `accum = 0.5 ** (32 / (10 * 1000))`, which is about 0.9977843871238888.
@LiUzHiAn It means the old weights' contribution decays to 50% after the model has been trained on 10,000 images, i.e. 312.5 steps at batch size 32.
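A minimal sketch of that relationship, assuming the per-step EMA update `ema = decay * ema + (1 - decay) * new` (the helper names below are illustrative, not taken from the repo):

```python
import math

def ema_decay(batch_size: int, half_life_kimg: float = 10.0) -> float:
    # Decay chosen so the old weights' contribution halves after
    # half_life_kimg * 1000 training images at the given batch size.
    return 0.5 ** (batch_size / (half_life_kimg * 1000))

def ema_update(ema_params: dict, model_params: dict, decay: float) -> dict:
    # In-place exponential moving average of the parameters.
    for k in ema_params:
        ema_params[k] = decay * ema_params[k] + (1 - decay) * model_params[k]
    return ema_params

decay = ema_decay(batch_size=32)       # ~0.99778, the accum value above
steps = 10 * 1000 / 32                 # 312.5 steps = 10,000 images

# After `steps` updates, the initial weights' share is decay**steps,
# which is exactly 0.5 by construction: 0.5 ** (32 * 312.5 / 10000) = 0.5 ** 1.
print(decay, decay ** steps)
```

Since `decay ** steps = 0.5 ** (batch_size * steps / 10000)`, the half-life is expressed in images seen, not optimizer steps, so the same constant gives consistent averaging behavior across batch sizes.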
Oh, I understand. Thanks.