shoe-design-using-generative-adversarial-networks icon indicating copy to clipboard operation
shoe-design-using-generative-adversarial-networks copied to clipboard

Shoe Design using Generative Adversarial Networks

Author: Hoang Le - Hoang Nguyen

This repository developed a Generative Adversarial Nets (GANs) to design and generate new shoes images. Many improvement techniques were implemented to enhance the performance of the model as well as the quality of the output.

This was also used as a final project for the course Deep Learning for Visual Recognition Spring 2017 semester.

For more information, you guys can read our report.

Directory structure:

shoe-design-using-generative-adversarial-networks
│   README.md
|   report.pdf
|
|--- data: original dataset
| 
|--- experiment

In experiment directory, you can find our notebooks which define DCGAN Networks and some modifications.

Dataset

In this project, we consider shoes dataset: UT Zappos50K The dataset consists of over 50,000 images Sneaker type is dominant in the dataset (12856 images), so we decide to use only sneakers' images as the main data source.

We put the data dir like this: data/ut-zap50k/Shoes/Sneakers_and_athletic_shoes/

Methods

Baseline Model

  • All pooling layers are replaced with strided convolutions (discriminator) and fractional-strided convolutions (generator)
  • Batch normalization is applied
  • All fully connected layers are removed.
  • ReLU is used for all layers of the generator, except Tanh is used for the output, and LeakyReLU is used for all layers of discriminator. dcgan architecture

Improvement Techniques

  1. Objectives:
    • To generate higher-resolution images
    • To avoid the case that D over-perform G
  2. Detail implementations:
    • Use a modified loss function: min[log(1-D)]  max[logD]
    • Use a spherical noise: the noise will be sampled from a Gaussian distribution.
    • Use one-sided label smoothing: make the discriminator target output from [0=fake image, 1=real image] to [0=fake image, 0.9=real image].
    • Freezing: Stop update D when loss D < 0.7 loss G.

Results

Experiment 1 Pure DCGAN with spherical noise

ex1

Experiment 2

  • Normalized input
  • Weight decay
  • Modified loss

ex2

Experiment 3

  • Normalized input
  • Weight decay
  • Sided-label of 0.9 for real data

ex3

Experiment 4

  • Normalized input
  • Weight decay
  • Sided-label of 0.9 for real data
  • Freezing: Stop update D when loss D < 0.7 loss G.

ex4

Conclusions

  • GANs is extremely unstable and hard to train.
  • The common case is that the discriminator became too powerful and was able to easily make the distinction between real and fake images while the generator was still dumb.
  • With some methods proposed, the results from the generative model were improved.
  • There are some artifacts that we can easily observe.

Requirement

Acknowledgments

Thank Du Phan for your guidance, without it we cannot finish this project.

References

See our report for detail.