hessian_penalty
Can this project be attached to StyleGAN2?
Hello, I want to apply the Hessian Penalty to StyleGAN2 for fine-tuning. Is that possible?
You can use our portable implementations of the Hessian Penalty for fine-tuning StyleGAN(-v2) in either PyTorch or TensorFlow.
While it should usually be fine to directly apply the Hessian Penalty to most GAN architectures without modification (e.g., ProGAN, BigGAN, etc.), StyleGAN has a number of quirks that slightly complicate things. We did try training a StyleGAN-v1 with the Hessian Penalty applied to the W vector. What we found is that the generator learns to ignore the first ~4 W vectors input to the main generator and instead offloads control of rotation/pose to the first 4x4 noise image input into the network (which makes sense as we did not regularize those noise images). So, naively fine-tuning a StyleGAN with the Hessian Penalty may not give good results.
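For reference, here is a minimal sketch of the finite-difference estimator behind the Hessian Penalty. This is an illustrative re-implementation, not the repo's actual `hessian_penalty` function; the `k` and `epsilon` defaults are assumptions:

```python
import torch

def hessian_penalty_sketch(G, z, k=2, epsilon=0.1):
    """Stochastic estimate of the Hessian Penalty for generator G at latents z.

    Draws k Rademacher directions and computes a second-order central finite
    difference along each; the variance of that quantity across directions
    estimates the sum of squared off-diagonal Hessian entries, reduced with a
    max over output elements.
    """
    G_z = G(z)
    second_diffs = []
    for _ in range(k):
        # Rademacher direction: each entry is +1 or -1, scaled by epsilon
        v = (torch.randint(0, 2, z.shape) * 2 - 1).to(z.dtype) * epsilon
        # Second-order central finite difference along v
        second_diffs.append((G(z + v) - 2 * G_z + G(z - v)) / (epsilon ** 2))
    stacked = torch.stack(second_diffs)          # shape (k, *G_z.shape)
    return stacked.var(dim=0, unbiased=True).max()
```

As a sanity check, a purely linear generator has a zero Hessian, so the penalty should vanish (up to floating-point error).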
To make things work better, you might need to do some of the following and train the StyleGAN from scratch:
- Remove the auxiliary noise images that are input into the generator
- Remove style mixing during training
- Regularize w.r.t. the Z vector instead of W (I think this is more likely to work than regularizing W or W+)
- Regularize w.r.t. each W vector in W+ as a group. In other words, you would be optimizing, say, Hessians of shape 18x18 (assuming there are 18 locations where a W vector is input into the main body of the generator). This can be done by modifying our implementation of `hessian_penalty` to treat each of the 18 W vectors as a unified group: sample a Rademacher tensor of shape (N, 18, 1) and add it to your batch of W+ vectors (which has shape (N, 18, 512)). Note that StyleGAN is already somewhat disentangled in W+ space (not W space), so it's not clear if this would help things.
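The grouping trick from the last bullet can be sketched as follows. Shapes and variable names are illustrative; in practice `w_perturbed` would be fed through StyleGAN's synthesis network inside the finite-difference estimator:

```python
import torch

# Illustrative shapes: a batch of 4 W+ codes for an 18-layer StyleGAN generator
N, num_ws, w_dim = 4, 18, 512
epsilon = 0.1
w_plus = torch.randn(N, num_ws, w_dim)

# One Rademacher entry per W vector rather than per dimension, so each of the
# 18 W vectors is perturbed as a unified group:
v = (torch.randint(0, 2, (N, num_ws, 1)) * 2 - 1).to(w_plus.dtype)

# Broadcasting (N, 18, 1) against (N, 18, 512) shifts every dimension of a
# given W vector by the same +/- epsilon:
w_perturbed = w_plus + epsilon * v
```

Because each W vector receives a single shared sign, the effective Hessian being penalized is 18x18 rather than (18*512)x(18*512).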
Finally, you may need to reduce the dimensionality of the Z/W vector (depending on which one you choose to regularize w.r.t.). Since we use a stochastic approximator for the Hessian Penalty, it is not clear that it will remain effective for very high-dimensional latent codes.