WaveGAN-pytorch
PyTorch implementation of "Synthesizing Audio with Generative Adversarial Networks" (Chris Donahue et al., Feb 2018).
Before running, make sure you have the sc09 dataset and place it under your current project path.
Quick Start:
- Installation
sudo apt-get install libav-tools
- Download dataset
  - sc09: sc09 raw WAV files, utterances of the spoken English words '0'-'9'
  - piano: Piano raw WAV files
- Run
For the sc09 task, make sure the sc09 dataset is under your current project path before running the code.
$ python train.py
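Before launching training, it can help to confirm that the dataset is where the script expects it. The following is a minimal sketch, assuming the dataset was extracted to a directory named sc09 under the project root; the directory name and the helper count_wav_files are illustrative and not part of this repo.

```python
from pathlib import Path


def count_wav_files(dataset_dir):
    """Return the number of .wav files found recursively under dataset_dir.

    Hypothetical helper for a pre-flight check; not part of train.py.
    """
    root = Path(dataset_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {root.resolve()}")
    # Match the extension case-insensitively so .WAV files are counted too.
    return sum(
        1 for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".wav"
    )


if __name__ == "__main__":
    try:
        # "sc09" is an assumed directory name; adjust it to your layout.
        n = count_wav_files("sc09")
        print(f"Found {n} WAV files under ./sc09")
    except FileNotFoundError as err:
        print(err)
```

Run from the project root, this prints the file count, or the missing path if the directory is absent.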
Training time
- For the SC09 dataset, 4 x Tesla P40 take nearly 2 days to get a reasonable result.
- For the piano dataset, 2 x Tesla P40 take 3-6 hours to get a reasonable result.
- Increasing BATCH_SIZE from 10 to 32 or 64 shortens the per-epoch time on multiple GPUs, but slows learning, since each epoch then performs fewer gradient updates.
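One way to read the batch-size trade-off above: a larger batch means fewer optimizer steps per pass over the data. The sketch below is only arithmetic; the dataset size of 16,000 examples is a hypothetical round number chosen for illustration, not a measured property of SC09 or the piano set, and updates_per_epoch is an illustrative helper.

```python
def updates_per_epoch(num_examples, batch_size):
    """Number of gradient updates in one epoch (last partial batch included)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (num_examples + batch_size - 1) // batch_size


# Hypothetical dataset size, for illustration only.
NUM_EXAMPLES = 16000

for batch_size in (10, 32, 64):
    steps = updates_per_epoch(NUM_EXAMPLES, batch_size)
    print(f"BATCH_SIZE={batch_size:>2}: {steps} updates per epoch")
```

With the assumed 16,000 examples, this gives 1600, 500, and 250 updates per epoch respectively, so each epoch is faster on several GPUs but moves the weights fewer times.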
Results
Generated "0-9": https://soundcloud.com/mazzzystar/sets/dcgan-sc09
Generated piano: https://soundcloud.com/mazzzystar/sets/wavegan-piano
Loss curve:

Architecture

TODO
- [ ] Add some evaluation experiments, e.g. inception score.
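For reference, the inception score is the exponential of the mean KL divergence between each sample's predicted class distribution p(y|x) and the marginal p(y). The sketch below shows only that final computation, in pure Python. It assumes a separately trained classifier (for example, a digit classifier for sc09) has already produced per-sample class probabilities; that classifier and the function name inception_score are assumptions, not part of this repo.

```python
import math


def inception_score(probs, eps=1e-12):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from class probabilities.

    probs: list of per-sample probability lists, each summing to 1.
    eps guards the logarithm against zero probabilities.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y): average of p(y|x) over all samples.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    mean_kl = 0.0
    for p in probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL sum.
        kl = sum(
            pk * (math.log(pk + eps) - math.log(marginal[k] + eps))
            for k, pk in enumerate(p)
            if pk > 0
        )
        mean_kl += kl / n
    return math.exp(mean_kl)
```

The score is 1 when every sample gets the same prediction and approaches the number of classes when predictions are both confident and evenly spread across classes.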
Contributions
This repo is based on chrisdonahue's and jtcramer's implementations.