
Can this project be attached to StyleGAN2?

Open chensjtu opened this issue 3 years ago • 1 comments

Hello, I want to apply the Hessian Penalty to StyleGAN2 for fine-tuning. Is that supported?

chensjtu avatar Aug 28 '20 04:08 chensjtu

You can use our portable implementations of the Hessian Penalty for fine-tuning StyleGAN(-v2) in either PyTorch or TensorFlow.
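At its core, the penalty is a stochastic estimator: perturb the latent code along random Rademacher directions and take the variance of second-order central differences, which vanishes exactly when the Hessian of the generator output is diagonal. A minimal NumPy sketch of that estimator (a simplification for a scalar-output toy generator; the function name and arguments here are illustrative, not the repo's exact API):

```python
import numpy as np

def hessian_penalty(G, z, k=50, epsilon=0.1, rng=None):
    """Stochastic estimate of the off-diagonal Hessian content of G at z.

    G: function mapping a latent vector z to a scalar (toy "generator").
    k: number of Rademacher directions used for the estimate.
    """
    rng = np.random.default_rng(rng)
    # k Rademacher vectors v in {-1, +1}^dim(z)
    vs = rng.choice([-1.0, 1.0], size=(k,) + z.shape)
    G_z = G(z)
    # Second-order central difference approximates v^T H v for each v
    second_diffs = np.stack([
        (G(z + epsilon * v) - 2.0 * G_z + G(z - epsilon * v)) / epsilon**2
        for v in vs
    ])
    # Variance across directions is ~0 iff the Hessian is diagonal
    return second_diffs.var(axis=0, ddof=1).max()

# Entangled toy generator: nonzero off-diagonal Hessian entry
entangled = lambda z: z[0] * z[1]
# Disentangled toy generator: purely diagonal Hessian
disentangled = lambda z: z[0]**2 + z[1]**2

z = np.array([0.5, -0.3])
print(hessian_penalty(entangled, z, rng=0))     # noticeably > 0
print(hessian_penalty(disentangled, z, rng=0))  # ~ 0
```

Because central differences are exact for quadratics, the disentangled case gives a penalty of essentially zero up to float error, while the entangled case does not.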

While it should usually be fine to directly apply the Hessian Penalty to most GAN architectures without modification (e.g., ProGAN, BigGAN, etc.), StyleGAN has a number of quirks that slightly complicate things. We did try training a StyleGAN-v1 with the Hessian Penalty applied to the W vector. What we found is that the generator learns to ignore the first ~4 W vectors input to the main generator and instead offloads control of rotation/pose to the first 4x4 noise image input into the network (which makes sense as we did not regularize those noise images). So, naively fine-tuning a StyleGAN with the Hessian Penalty may not give good results.

To make things work better, you might need to do some of the following and train the StyleGAN from scratch:

  • Remove the auxiliary noise images that are input into the generator
  • Remove style mixing during training
  • Regularize w.r.t. the Z vector instead of W (I think this is more likely to work than regularizing W or W+)
  • Regularize w.r.t. each W vector in W+ as a group. In other words, you would be optimizing Hessians of shape 18x18 (assuming there are 18 locations where a W vector is input into the main body of the generator). You can do this by modifying our implementation of hessian_penalty to treat each of the 18 W vectors as a unified group: sample a Rademacher tensor of shape (N, 18, 1) and add it to your batch of W+ vectors (which has shape (N, 18, 512)). Note that StyleGAN is already somewhat disentangled in W+ space (not W space), so it's not clear whether this would help.
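The grouped perturbation in the last bullet can be sketched as follows. The shapes match the comment above (N batch, 18 style inputs, 512-dim W); the variable names and epsilon value are illustrative, not from the repo:

```python
import numpy as np

N, num_ws, w_dim = 4, 18, 512
rng = np.random.default_rng(0)

w_plus = rng.standard_normal((N, num_ws, w_dim))  # batch of W+ codes

# One Rademacher entry per W vector, shared across all 512 channels,
# so each of the 18 style inputs is perturbed as a unified group.
v = rng.choice([-1.0, 1.0], size=(N, num_ws, 1))

epsilon = 0.1
w_plus_pos = w_plus + epsilon * v  # (N, 18, 1) broadcasts to (N, 18, 512)
w_plus_neg = w_plus - epsilon * v

# These perturbed codes feed the usual second-order difference
# (G(w+) - 2*G(w) + G(w-)) / epsilon**2, whose variance over several
# draws of v gives the grouped (18x18) Hessian Penalty.
print(w_plus_pos.shape)  # (4, 18, 512)
```

Broadcasting does the grouping for you: every one of the 512 channels of a given W vector moves in the same direction, so the estimator only "sees" 18 effective input dimensions.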

Finally, you may need to reduce the dimensionality of the Z/W vector (depending on which one you choose to regularize w.r.t.). Since we use a stochastic approximator for the Hessian Penalty, it is not clear that it remains effective for very high-dimensional latent codes.

wpeebles avatar Aug 28 '20 05:08 wpeebles