StyleAlign
StyleAlign: Analysis and Applications of Aligned StyleGAN Models
Zongze Wu, Yotam Nitzan, Eli Shechtman, Dani Lischinski
https://openreview.net/pdf?id=Qg2vi4ZbHM9
Abstract: In this paper, we perform an in-depth study of the properties and applications of aligned generative models. We refer to two models as aligned if they share the same architecture, and one of them (the child) is obtained from the other (the parent) via fine-tuning to another domain, a common practice in transfer learning. Several works already utilize some basic properties of aligned StyleGAN models to perform image-to-image translation. Here, we perform the first detailed exploration of model alignment, also focusing on StyleGAN. First, we empirically analyze aligned models and provide answers to important questions regarding their nature. In particular, we find that the child model's latent spaces are semantically aligned with those of the parent, inheriting incredibly rich semantics, even for distant data domains such as human faces and churches. Second, equipped with this better understanding, we leverage aligned models to solve a diverse set of tasks. In addition to image translation, we demonstrate fully automatic cross-domain image morphing. We further show that zero-shot vision tasks may be performed in the child domain, while relying exclusively on supervision in the parent domain. We demonstrate qualitatively and quantitatively that our approach yields state-of-the-art results, while requiring only simple fine-tuning and inversion.
usage
Train a parent StyleGAN model in domain A, then use the parent model's weights to initialize the child model (by adding the --resume flag) and fine-tune it in domain B. This yields aligned parent and child models, which can then be used for image translation or morphing with the code below.
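The two training runs described above can be sketched as a pair of command lines. This is only an illustration: it assumes the training script is the StyleGAN2-ADA `train.py` with its `--outdir`, `--data`, and `--resume` options, and every path below is a hypothetical placeholder, not a file shipped with this repo.

```python
def training_commands(parent_data, child_data, parent_out, child_out, parent_pkl):
    """Build the parent and child training commands as argument lists.

    Assumes a StyleGAN2-ADA style train.py; all paths are placeholders.
    """
    # Parent: trained from scratch on domain A.
    parent_cmd = [
        "python", "train.py",
        f"--outdir={parent_out}",
        f"--data={parent_data}",
    ]
    # Child: same architecture, initialized from the parent weights via
    # --resume, then fine-tuned on domain B.
    child_cmd = [
        "python", "train.py",
        f"--outdir={child_out}",
        f"--data={child_data}",
        f"--resume={parent_pkl}",
    ]
    return parent_cmd, child_cmd


parent_cmd, child_cmd = training_commands(
    parent_data="./datasets/domain_A.zip",    # hypothetical dataset path
    child_data="./datasets/domain_B.zip",     # hypothetical dataset path
    parent_out="./training-runs/parent",      # hypothetical output directory
    child_out="./training-runs/child",        # hypothetical output directory
    parent_pkl="./checkpoint/parent.pkl",     # hypothetical parent checkpoint
)
print(" ".join(parent_cmd))
print(" ".join(child_cmd))
```

The only difference between the two runs is the dataset and the `--resume` argument, which is what makes the resulting models aligned.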
pretrained checkpoint
The pretrained checkpoints can be downloaded from here. The FFHQ model is from the StyleGAN2 repo. The FFHQ512, FFHQ512_dog, FFHQ512_cat, and FFHQ512_wild models are from the StyleGAN2-ada repo. The other models were trained or fine-tuned by us.
To download all checkpoints:
gdown --fuzzy 'https://drive.google.com/drive/folders/1MqCHQ6Yx-eon-3fu1g_AGjpyAUmzH6Jy?usp=sharing' -O /checkpoint --folder
Image-to-Image Translation
source_img_path='./example/dog/'
source_path='./img_invert/ffhq512_dog/z/' # path for saving inverted latent codes and images
target_path='./img_invert/ffhq512_dog/translate/cat/' # path for saving translation images
source_pkl='./checkpoint/ffhq512_dog.pkl'
target_pkl='./checkpoint/ffhq512_dog_cat.pkl'
compare_html='./img_invert/ffhq512_dog/translate/cat.html'
python projector_z.py --outdir=$source_path \
--target=$source_img_path \
--network=$source_pkl
python I2I.py --network $target_pkl \
--source_path $source_path \
--target_path $target_path
python Compare.py --source_img_path $source_img_path \
--source_path $source_path \
--target_path $target_path \
--save_path $compare_html
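The three steps above (invert with the source model, generate with the target model, build the comparison page) can also be driven from Python. The sketch below only assembles the same commands and flags shown above; the helper names `translation_commands` and `run_all` are hypothetical and not part of this repo.

```python
import subprocess


def translation_commands(source_img_path, source_path, target_path,
                         source_pkl, target_pkl, compare_html):
    """Return the three pipeline steps as argument lists, in run order."""
    return [
        # 1. Invert the source images into the z space of the source model.
        ["python", "projector_z.py",
         f"--outdir={source_path}",
         f"--target={source_img_path}",
         f"--network={source_pkl}"],
        # 2. Feed the inverted latent codes to the aligned target model.
        ["python", "I2I.py",
         "--network", target_pkl,
         "--source_path", source_path,
         "--target_path", target_path],
        # 3. Write an HTML page comparing the source and translated images.
        ["python", "Compare.py",
         "--source_img_path", source_img_path,
         "--source_path", source_path,
         "--target_path", target_path,
         "--save_path", compare_html],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


commands = translation_commands(
    source_img_path="./example/dog/",
    source_path="./img_invert/ffhq512_dog/z/",
    target_path="./img_invert/ffhq512_dog/translate/cat/",
    source_pkl="./checkpoint/ffhq512_dog.pkl",
    target_pkl="./checkpoint/ffhq512_dog_cat.pkl",
    compare_html="./img_invert/ffhq512_dog/translate/cat.html",
)
run_all(commands, dry_run=True)
```

Note that the inversion uses the source checkpoint while the generation uses the target checkpoint: the same latent codes are reused across the two aligned models.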
Cross-domain Image Morphing
To morph images from different domains, train an e4e encoder in each domain and invert the images into w+ space. We provide pretrained e4e models for FFHQ512, FFHQ512_dog, and FFHQ512_dog_cat here. We use w+ space because it gives better image reconstruction than z space.
source_pkl='./checkpoint/ffhq512_dog.pkl'
target_pkl='./checkpoint/ffhq512_dog_cat.pkl'
source_latent='./img_invert/ffhq512_dog/e4e_w_plus/flickr_dog_000043.npy' #w_plus
target_latent='./img_invert/ffhq512_dog_cat/e4e_w_plus/flickr_cat_000008.npy' #w_plus
python MergeFace.py --source_pkl $source_pkl --target_pkl $target_pkl --source_latent $source_latent --target_latent $target_latent
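As a rough illustration of what morphing between two inverted codes involves, the sketch below linearly interpolates two w+ latent codes. It is not the code in MergeFace.py, which also takes both model checkpoints and may do more than blend latent codes. The random arrays stand in for the saved `.npy` files, and the (16, 512) shape is an assumption for a 512x512 StyleGAN2 generator.

```python
import numpy as np


def lerp(a, b, t):
    """Linear interpolation: returns a at t=0 and b at t=1."""
    return (1.0 - t) * a + t * b


def morph_latents(source_w_plus, target_w_plus, num_frames):
    """Return num_frames latent codes evenly spaced from source to target."""
    ts = np.linspace(0.0, 1.0, num_frames)
    return np.stack([lerp(source_w_plus, target_w_plus, t) for t in ts])


# Stand-ins for np.load(source_latent) and np.load(target_latent);
# the (16, 512) shape is an assumption, not read from the real files.
rng = np.random.default_rng(0)
source_w_plus = rng.standard_normal((16, 512))
target_w_plus = rng.standard_normal((16, 512))

frames = morph_latents(source_w_plus, target_w_plus, num_frames=5)
print(frames.shape)  # (5, 16, 512)
```

Each row of `frames` would then be passed to a generator to render one frame of the morph; the first and last frames reproduce the source and target codes exactly.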
We can also translate an image from the source domain to the target domain and create a smooth video. We use w+ space in the source domain for better reconstruction and z space in the target domain for better translation. Add the --target_is_z flag at the end of the command.
source_pkl='./checkpoint/ffhq512_dog.pkl'
target_pkl='./checkpoint/ffhq512_dog_cat.pkl'
source_latent='./img_invert/ffhq512_dog/e4e_w_plus/flickr_dog_000045.npy' # w_plus
target_latent='./img_invert/ffhq512_dog/z/flickr_dog_000045.npz' # z
python MergeFace.py --source_pkl $source_pkl --target_pkl $target_pkl --source_latent $source_latent --target_latent $target_latent --target_is_z
Shared Semantic Controls Between Parent and Child Models
Image Translation
Cross-domain Image Morphing
Knowledge Transfer from Parent to Child Domain
Citation
If you use this code for your research, please cite our paper:
@article{wu2021stylealign,
title={StyleAlign: Analysis and Applications of Aligned StyleGAN Models},
author={Wu, Zongze and Nitzan, Yotam and Shechtman, Eli and Lischinski, Dani},
journal={arXiv preprint arXiv:2110.11323},
year={2021}
}