asrgen
Attacking Speaker Recognition Systems with Deep Generative Models
PyTorch implementation of Attacking Speaker Recognition Systems with Deep Generative Models.
Pre-requisites
- NVIDIA GPU + CUDA cuDNN
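The prerequisite above can be checked from Python before training. The following is a minimal sketch, not part of this repo: the function name `check_gpu_prerequisites` is hypothetical, and it assumes PyTorch is the only dependency that needs to see CUDA and cuDNN. It reports everything as missing if PyTorch is not installed yet.

```python
def check_gpu_prerequisites():
    """Return a dict of booleans: is torch importable, and can it see CUDA and cuDNN?

    Hypothetical helper for illustration; it is not part of the asrgen code.
    """
    status = {"torch": False, "cuda": False, "cudnn": False}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed yet, so nothing else can be checked.
        return status

    status["torch"] = True
    # True only if a CUDA-capable GPU and a working driver are visible.
    status["cuda"] = bool(torch.cuda.is_available())
    # cuDNN only matters when CUDA itself is usable.
    status["cudnn"] = status["cuda"] and bool(torch.backends.cudnn.is_available())
    return status


if __name__ == "__main__":
    for name, ok in check_gpu_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If `cuda` or `cudnn` is reported as missing, the driver or CUDA toolkit installation is the likely place to look before running the training script.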
Data and pre-trained models:
Setup
- Clone this repo:
git clone https://github.com/rafaelvalle/asrgen.git
- CD into this repo:
cd asrgen
- Download and unzip audio data into this repo
- Install python requirements:
pip install -r requirements.txt
Training
- python gan_train.py
- (OPTIONAL) tensorboard --logdir=./
Synthesize audio samples with a Generator
- jupyter notebook --ip=127.0.0.1 --port=31337
- Load gan_synthesis.ipynb
Acknowledgements
This implementation uses code from [NVIDIA's Tacotron 2](https://github.com/nvidia/tacotron2) and from repos by Martin Arjovsky and Prem Seetharaman.
We are thankful to Prem Seetharaman and Markus Rabe for their feedback on the early draft of this paper.
We are grateful to NVIDIA for donating the Titan X used in this research.