
Full-Glow: Fully conditional Glow for more realistic image generation

Full-Glow

This repo contains the implementation of Full-Glow: Fully conditional Glow for more realistic image generation: https://arxiv.org/abs/2012.05846. A short presentation of the work can be seen here.

Full-Glow extends previous Glow-based models for conditional image generation by applying conditioning to all Glow operations through appropriate conditioning networks. It was applied to the Cityscapes dataset (label → photo) to synthesize street-scene images.
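As a rough illustration of the idea (not the repo's actual code), conditioning a Glow operation can be sketched as an affine coupling layer whose scale and shift are produced by a conditioning network that sees both the untouched half of the activations and the condition. The names below (`cond_net`, `w`) are hypothetical, and a single linear layer stands in for the real conditioning network:

```python
import numpy as np

def cond_net(x_a, cond, w):
    # Hypothetical conditioning network: one linear layer mapping the
    # untouched half and the condition to (log_scale, shift).
    h = np.concatenate([x_a, cond], axis=-1) @ w
    log_s, t = np.split(h, 2, axis=-1)
    return np.tanh(log_s), t  # tanh keeps scales in a stable range

def coupling_forward(x, cond, w):
    x_a, x_b = np.split(x, 2, axis=-1)
    log_s, t = cond_net(x_a, cond, w)
    y_b = x_b * np.exp(log_s) + t      # condition-dependent affine transform
    return np.concatenate([x_a, y_b], axis=-1)

def coupling_inverse(y, cond, w):
    y_a, y_b = np.split(y, 2, axis=-1)
    log_s, t = cond_net(y_a, cond, w)  # y_a == x_a, so parameters match
    x_b = (y_b - t) * np.exp(-log_s)
    return np.concatenate([y_a, x_b], axis=-1)
```

Because the first half passes through unchanged, the inverse can recompute the same scale and shift, so the layer stays exactly invertible no matter how complex the conditioning network is.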

Quantitative results

Full-Glow was evaluated quantitatively against previous Glow-based models (C-Glow and DUAL-Glow) and the GAN-based pix2pix using the PSPNet classifier. With each trained model, we ran inference on the Cityscapes validation set three times and calculated the PSP scores.

| Model | Conditional BPD ↓ | Mean pixel acc. ↑ | Mean class acc. ↑ | Mean class IoU ↑ |
|---|---|---|---|---|
| C-Glow v.1 | 2.568 | 35.02 ± 0.56 | 12.15 ± 0.05 | 7.33 ± 0.09 |
| C-Glow v.2 | 2.363 | 52.33 ± 0.46 | 17.37 ± 0.21 | 12.31 ± 0.24 |
| DUAL-Glow | 2.585 | 71.44 ± 0.03 | 23.91 ± 0.19 | 18.96 ± 0.17 |
| pix2pix | --- | 60.56 ± 0.11 | 22.64 ± 0.21 | 16.42 ± 0.06 |
| Full-Glow | 2.345 | 73.50 ± 0.13 | 29.13 ± 0.39 | 23.86 ± 0.30 |
| Ground-truth | --- | 95.97 | 84.31 | 77.30 |
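The mean ± standard deviation entries come from the three inference runs per model. Aggregating repeated scores into that format can be sketched as follows (the example numbers in the usage note are made up, and whether the paper uses sample or population standard deviation is not stated here; the sketch uses the sample version):

```python
from statistics import mean, stdev

def summarize(runs):
    """Format repeated evaluation scores as 'mean ± std' with 2 decimals."""
    return f"{mean(runs):.2f} \u00b1 {stdev(runs):.2f}"
```

For instance, `summarize([72.1, 73.0, 72.4])` would yield a string like `"72.50 ± 0.46"`.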

Visual samples in 512x1024 resolution (please zoom to see more details)

Left: condition. Right: synthesized image.

Visual examples of content transfer (please zoom to see more details)

Images from left to right: Desired content - Desired structure - Content applied to structure - Ground-truth for structure

Samples generated on the maps dataset

Top row: condition. Bottom row: synthesized.

Training

To train a model on e.g. Cityscapes, one can run:

```bash
python3 main.py --model improved_so_large_longer --img_size 512 1024 --dataset cityscapes --direction label2photo --n_block 4 --n_flow 8 8 8 8 --do_lu --reg_factor 0.0001 --grad_checkpoint
```

Arguments

  • `--model` specifies the model name (it should contain 'improved' to enable training Full-Glow)
  • `--dataset` selects the dataset. Data loaders for the Cityscapes, MNIST, and maps datasets are already implemented here
  • `--do_lu` enables LU decomposition for the invertible 1×1 convolutions, which noticeably reduces training time
  • `--reg_factor` sets the regularization factor applied to the right-hand side of the objective function
  • `--grad_checkpoint` enables gradient checkpointing, which is needed here for training on larger images
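A minimal sketch of how these flags might be parsed with `argparse` (the actual `main.py` may define them differently; this only mirrors the options listed above):

```python
import argparse

def build_parser():
    p = argparse.ArgumentParser(description="Train a Full-Glow model (sketch)")
    p.add_argument("--model", type=str, required=True,
                   help="model name; include 'improved' to train Full-Glow")
    p.add_argument("--img_size", type=int, nargs=2, help="height width")
    p.add_argument("--dataset", type=str,
                   choices=["cityscapes", "mnist", "maps"])
    p.add_argument("--direction", type=str, default="label2photo")
    p.add_argument("--n_block", type=int, default=4)
    p.add_argument("--n_flow", type=int, nargs="+", default=[8, 8, 8, 8],
                   help="flow steps per block")
    p.add_argument("--do_lu", action="store_true",
                   help="LU decomposition for the invertible 1x1 convolutions")
    p.add_argument("--reg_factor", type=float, default=0.0001)
    p.add_argument("--grad_checkpoint", action="store_true",
                   help="trade compute for memory on large images")
    return p
```

Feeding the training command from above through this parser would yield, e.g., `args.n_flow == [8, 8, 8, 8]` and `args.do_lu == True`.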

Description of packages in the project

  • `data_handler` contains the data loaders for the different datasets
  • `evaluation` contains code for evaluating the models
  • `experiments` has code for experiments such as content transfer and sampling
  • `helper` contains helper functions for dealing with files, directories, saving/loading checkpoints, etc.
  • `models` contains the implementations of Full-Glow, DUAL-Glow, and C-Glow
  • `trainer` has the implementation of the training loop and loss function

Checkpoints

Checkpoints for all the Cityscapes models trained in this project (including C-Glow and DUAL-Glow) can be found here: https://kth.box.com/s/h3r9jt5pq8itrnkp0t2qy11pui7u6dmc

Notes

  • My implementation of the baseline Glow borrows heavily from Kim Seonghyeon's helpful implementation: https://github.com/rosinality/glow-pytorch

Citation

If you use our code or build on our method, please cite our paper:

```bibtex
@inproceedings{sorkhei2021full,
  author={Sorkhei, Moein and Henter, Gustav Eje and Kjellstr{\"o}m, Hedvig},
  title={Full-{G}low: {F}ully conditional {G}low for more realistic image generation},
  booktitle={Proceedings of the DAGM German Conference on Pattern Recognition (GCPR)},
  volume={43},
  month={Oct.},
  year={2021}
}
```