
Generative models (GANs, VAEs, diffusion models, autoregressive models) implemented with PyTorch, PyTorch Lightning, and Hydra.

:img-size: 200
:toc: macro

= Collections of Image Generation Models

image:https://img.shields.io/badge/-Python 3.7--3.9-blue?style=for-the-badge&logo=python&logoColor=white[python, link=https://pytorch.org/get-started/locally/] image:https://img.shields.io/badge/-PyTorch 1.8+-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white[pytorch, link=https://pytorch.org/] image:https://img.shields.io/badge/-Lightning 1.3+-792ee5?style=for-the-badge&logo=pytorchlightning&logoColor=white[pytorch_lightning, link=https://www.pytorchlightning.ai/] image:https://img.shields.io/badge/config-hydra 1.1-89b8cd?style=for-the-badge&labelColor=gray[hydra, link=https://hydra.cc/]

An easily scalable, hierarchical framework that includes many image generation methods and supports various datasets.


== Highlights 💡

* Various types of image generation methods (continuously updated):
** GANs: WGAN, InfoGAN, BiGAN
** VAEs: VQ-VAE, Beta-VAE, FactorVAE
** Autoregressive models: PixelCNN
** Diffusion models: DDPM
* Decomposition of model training, datasets, and networks:

[source, bash]
----
python run.py model=wgan networks=conv64 datamodule=celeba exp_name=wgan/celeba_conv64
----

* Hierarchical configuration of experiments in YAML files:
** Manually change configs in configs/model, configs/datamodule, and configs/networks
** Run predefined experiments in configs/experiment:

[source, bash]
----
python run.py experiment=vanilla_gan/cifar10
----

** Override hyperparameters from the command line:
+
[source, bash]
----
python run.py experiment=vanilla_gan/cifar10 model.lrG=1e-3 model.lrD=1e-3 exp_name=vanilla_gan/custom_lr
----

* Run multiple experiments at the same time:
** Grid search over hyperparameters:

[source, bash]
----
python run.py -m experiment=vae/mnist_conv model.lr=1e-3,5e-4,1e-4 "exp_name=vae/lr_${model.lr}"
----

** Run multiple experiments from config files:
+
[source, bash]
----
python run.py -m experiment=vae/mnist_conv,vae/cifar10,vae/celeba
----

== Setup

* Clone this repo:

[source, bash]
----
git clone https://github.com/Victarry/Image-Generation-models.git
----

* Create a new Python environment with conda and install the requirements:

[source, bash]
----
conda create -n image-generation python=3.10
conda activate image-generation
pip install -r requirements.txt
----

* Run your first experiment ✔️

[source, bash]
----
python run.py experiment=vae/mnist_conv
----

For other datasets, refer to the datasets documentation.

== Project Structure

== Generative Adversarial Networks (GANs)

=== GAN
Generative Adversarial Nets. +
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio. +
NeurIPS 2014. [https://arxiv.org/abs/1406.2661[PDF]] [https://arxiv.org/abs/1701.00160[Tutorial]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/gan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/gan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/gan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
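
The minimax game between generator and discriminator can be sketched in a few lines of plain Python. This is an illustration of the losses only (operating on batch lists of discriminator probabilities), not this repository's implementation; the generator uses the common non-saturating form:

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator: push D(x) toward 1 on
    real samples and D(G(z)) toward 0 on generated samples."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Non-saturating generator loss: maximize log D(G(z)) instead of
    minimizing log(1 - D(G(z))), which gives stronger early gradients."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

At equilibrium, D outputs 0.5 everywhere and the discriminator loss equals 2·log 2 ≈ 1.386.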

=== LSGAN
Least Squares Generative Adversarial Networks. +
Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley. +
ICCV 2017. [https://arxiv.org/abs/1611.04076[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/lsgan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/lsgan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/lsgan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== WGAN
Wasserstein GAN. +
Martin Arjovsky, Soumith Chintala, Léon Bottou. +
ICML 2017. [https://arxiv.org/abs/1701.07875[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/wgan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/wgan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/wgan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
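
WGAN replaces the binary cross-entropy objective with a Wasserstein estimate from a weight-clipped critic. A minimal sketch of the two ingredients, assuming plain-Python lists of critic scores and weights (illustrative only, not this repo's code):

```python
def critic_loss(c_real, c_fake):
    """WGAN critic loss: the critic maximizes E[C(x)] - E[C(G(z))],
    so we minimize the negated difference of batch means. Critic
    outputs are unbounded scores, not probabilities."""
    return -(sum(c_real) / len(c_real) - sum(c_fake) / len(c_fake))

def clip_weights(weights, c=0.01):
    """Clamp each weight to [-c, c] after every critic step: the paper's
    (crude) way of keeping the critic approximately 1-Lipschitz."""
    return [max(-c, min(c, w)) for w in weights]
```

WGAN-GP below replaces the clipping step with a gradient penalty, which avoids the capacity loss that hard clamping causes.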

=== WGAN-GP
Improved Training of Wasserstein GANs. +
Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville. +
NeurIPS 2017. [https://arxiv.org/abs/1704.00028[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/wgangp/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/wgangp/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/wgangp/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== VAE-GAN
Autoencoding Beyond Pixels Using a Learned Similarity Metric. +
Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther. +
ICML 2016. [https://arxiv.org/abs/1512.09300[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/vaegan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vaegan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vaegan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== BiGAN/ALI
BiGAN: Adversarial Feature Learning. +
Jeff Donahue, Philipp Krähenbühl, Trevor Darrell. +
ICLR 2017. [https://arxiv.org/abs/1605.09782[PDF]]

ALI: Adversarial Learned Inference. +
Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville. +
ICLR 2017. [https://arxiv.org/abs/1606.00704[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/bigan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/bigan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/bigan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== GGAN
Geometric GAN. +
Jae Hyun Lim, Jong Chul Ye. +
arXiv 2017. [https://arxiv.org/abs/1705.02894[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/ggan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/ggan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/ggan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== InfoGAN
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. +
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel. +
NeurIPS 2016. [https://arxiv.org/abs/1606.03657[PDF]]

[cols="5*", options="header"]
|===
^| Manipulated Latent ^| Random Samples ^| Discrete Latent (class label) ^| Continuous Latent-1 (rotation) ^| Continuous Latent-2 (thickness)

^.^| Results
| image:assets/infogan/random.jpg[random, {img-size}, {img-size}]
| image:assets/infogan/class.jpg[class, {img-size}, {img-size}]
| image:assets/infogan/rotation.jpg[rotation, {img-size}, {img-size}]
| image:assets/infogan/thickness.jpg[thickness, {img-size}, {img-size}]
|===

== Variational Autoencoders (VAEs)

=== VAE
Auto-Encoding Variational Bayes. +
Diederik P. Kingma, Max Welling. +
ICLR 2014. [https://arxiv.org/abs/1312.6114[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/vae/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vae/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vae/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
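
The two tricks that make the ELBO trainable, reparameterized sampling and the closed-form Gaussian KL, can be sketched in pure Python (illustrative only; the actual models train on PyTorch tensors):

```python
import math
import random

def reparameterize(mu, logvar, rng=random):
    """z = mu + sigma * eps with eps ~ N(0, I): sampling is moved outside
    the computation graph so gradients can flow through mu and logvar."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def kl_divergence(mu, logvar):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior:
    -1/2 * sum(1 + log sigma^2 - mu^2 - sigma^2)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))
```

The KL term vanishes exactly when the posterior matches the standard normal prior (mu = 0, logvar = 0).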

=== cVAE
Learning Structured Output Representation Using Deep Conditional Generative Models. +
Kihyuk Sohn, Honglak Lee, Xinchen Yan. +
NeurIPS 2015. [https://papers.nips.cc/paper/2015/hash/8d55a249e6baa5c06772297520da2051-Abstract.html[PDF]]

[cols="3*", options="header"]
|===
^| Dataset ^| MNIST ^| CIFAR10

^.^| Results
| image:assets/cvae/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/cvae/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== Beta-VAE
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. +
Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner. +
ICLR 2017. [https://openreview.net/forum?id=Sy2fzU9gl[PDF]]

[cols="3*", options="header"]
|===
^| Dataset ^| CelebA ^| dSprites

^.^| Sample
| image:assets/beta_vae/celeba_sample.jpg[celeba, {img-size}, {img-size}]
| image:assets/beta_vae/dsprites_sample.jpg[dsprites, {img-size}, {img-size}]

^.^| Latent Interpolation
| image:assets/beta_vae/celeba_traverse.jpg[celeba, {img-size}, {img-size}]
| image:assets/beta_vae/dsprites_traverse.jpg[dsprites, {img-size}, {img-size}]
|===
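
The only change relative to the vanilla VAE objective is a weight β on the KL term. A sketch, where `recon_loss` and `kl` are hypothetical stand-ins for the two ELBO terms (not this repo's API):

```python
def beta_vae_loss(recon_loss, kl, beta=4.0):
    """beta-VAE objective: reconstruction term plus a beta-weighted KL term.

    beta = 1 recovers the standard VAE ELBO; beta > 1 pushes the posterior
    harder toward the isotropic prior, trading reconstruction quality for
    more disentangled latents."""
    return recon_loss + beta * kl
```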

=== Factor-VAE
Disentangling by Factorising. +
Hyunjik Kim, Andriy Mnih. +
ICML 2018. [https://arxiv.org/abs/1802.05983[PDF]]

[cols="3*", options="header"]
|===
^| Dataset ^| CelebA ^| dSprites

^.^| Sample
| image:assets/factor_vae/fvae_sample_celeba.jpg[celeba, {img-size}, {img-size}]
| image:assets/factor_vae/fvae_dsprites_sample.jpg[dsprites, {img-size}, {img-size}]

^.^| Latent Interpolation
| image:assets/factor_vae/fvae_celeba_traverse.jpg[celeba, {img-size}, {img-size}]
| image:assets/factor_vae/fvae_dsprites_traverse.jpg[dsprites, {img-size}, {img-size}]
|===

=== AAE
Adversarial Autoencoders. +
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey. +
arXiv 2015. [https://arxiv.org/abs/1511.05644[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/aae/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/aae/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/aae/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===

=== AGE
Adversarial Generator-Encoder Networks. +
Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky. +
AAAI 2018. [https://arxiv.org/abs/1704.02304[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/age/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| TODO
| TODO
|===

=== VQ-VAE
Neural Discrete Representation Learning. +
Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu. +
NeurIPS 2017. [https://arxiv.org/abs/1711.00937[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Ground truth
| image:assets/vqvae/mnist_real.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vqvae/celeba_real.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vqvae/cifar10_real.jpg[cifar10_conv, {img-size}, {img-size}]

^.^| Reconstruction
| image:assets/vqvae/mnist_recon.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vqvae/celeba_recon.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vqvae/cifar10_recon.jpg[cifar10_conv, {img-size}, {img-size}]
|===
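
VQ-VAE's core operation is nearest-neighbour lookup of each encoder output in a learned codebook. A pure-Python sketch (illustrative only; the real model does this per spatial position on tensors, copying gradients straight through the non-differentiable argmin):

```python
def quantize(z, codebook):
    """Replace encoder output vector z with its nearest codebook entry
    under squared Euclidean distance, returning (index, vector)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return idx, codebook[idx]
```

The discrete indices produced here are what an autoregressive prior (e.g. PixelCNN) is later trained to model.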

== Autoregressive Models

=== MADE
MADE: Masked Autoencoder for Distribution Estimation. +
Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle. +
ICML 2015. [https://arxiv.org/abs/1502.03509[PDF]]

[cols="2*", options="header"]
|===
^| Dataset ^.^| Samples

^.^| MNIST
| image:assets/made/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
|===
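
MADE makes an autoencoder autoregressive by masking its weight matrices according to per-unit "degrees". A pure-Python sketch of the mask construction for a single hidden layer (illustrative, not this repo's implementation):

```python
import random

def made_masks(n_in, n_hidden, seed=0):
    """Degree-based MADE masks for one hidden layer.

    Input i gets degree i+1; each hidden unit draws a degree in
    [1, n_in - 1]. A hidden unit may see inputs with degree <= its own,
    and output i may see only hidden units with degree strictly below
    its own, so output i never depends on input i or later inputs."""
    rng = random.Random(seed)
    deg_in = list(range(1, n_in + 1))
    deg_h = [rng.randint(1, n_in - 1) for _ in range(n_hidden)]
    m1 = [[int(deg_h[j] >= deg_in[i]) for i in range(n_in)]
          for j in range(n_hidden)]          # hidden x input mask
    m2 = [[int(deg_in[i] > deg_h[j]) for j in range(n_hidden)]
          for i in range(n_in)]              # output x hidden mask
    return m1, m2
```

Multiplying the two masks gives the output-to-input connectivity matrix, which is strictly lower triangular, exactly the autoregressive property.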

=== PixelCNN
Conditional Image Generation with PixelCNN Decoders. +
Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu. +
NeurIPS 2016. [https://arxiv.org/abs/1606.05328[PDF]]

[cols="3*", options="header"]
|===
^| Dataset ^.^| Samples ^.^| Class-Conditional Samples

^.^| MNIST
| image:assets/pixelcnn/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/pixelcnn/mnist_cond.jpg[mnist_cond, {img-size}, {img-size}]
|===
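
PixelCNN enforces the raster-scan factorization p(x) = ∏ p(x_i | x_<i) with masked convolutions. A sketch of the k×k mask (type 'A' in the first layer so a pixel cannot see itself, type 'B' afterwards); illustrative only, not this repo's code:

```python
def causal_mask(k, mask_type="A"):
    """Build a k x k convolution mask for the raster-scan ordering:
    zero out every position after the center pixel. Type 'A' also
    masks the center itself; type 'B' keeps it."""
    c = k // 2
    mask = [[1] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            after_center = i > c or (i == c and j > c)
            if after_center or (mask_type == "A" and i == c and j == c):
                mask[i][j] = 0
    return mask
```

In practice the mask is multiplied element-wise into the convolution weights before every forward pass.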

=== Transformer
Vanilla Transformer-based autoregressive model.

[cols="3*", options="header"]
|===
^| Dataset ^.^| Samples ^.^| Class-Conditional Samples

^.^| MNIST
| image:assets/tar/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/tar/mnist_cond.jpg[mnist_cond, {img-size}, {img-size}]
|===

[source, bash]
----
python run.py experiment=tar/mnist
python run.py experiment=tar/mnist_cond
----

== Diffusion Models

=== DDPM
Denoising Diffusion Probabilistic Models. +
Jonathan Ho, Ajay Jain, Pieter Abbeel. +
NeurIPS 2020. [https://arxiv.org/abs/2006.11239[PDF]]

[cols="4*", options="header"]
|===
^| Dataset ^| MNIST ^| CelebA ^| CIFAR10

^.^| Results
| image:assets/ddpm/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/ddpm/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/ddpm/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
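
DDPM's forward process admits a closed form for q(x_t | x_0), which is what makes training tractable: any noise level can be sampled directly without simulating the chain. A pure-Python sketch with the paper's linear β schedule (illustrative only; real code operates on image tensors):

```python
import math

def alpha_bars(T=1000, beta_1=1e-4, beta_T=0.02):
    """Cumulative products abar_t = prod_{s<=t} (1 - beta_s) for a
    linear beta schedule from beta_1 to beta_T (the paper's defaults)."""
    out, prod = [], 1.0
    for t in range(T):
        beta = beta_1 + (beta_T - beta_1) * t / (T - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def q_sample(x0, t, a_bars, eps):
    """Closed-form forward process:
    x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    a = a_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
```

By t = T the signal coefficient is nearly zero, so x_T is indistinguishable from pure Gaussian noise, which is where sampling starts at generation time.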