Attacking Speaker Recognition Systems with Deep Generative Models

PyTorch implementation of Attacking Speaker Recognition Systems with Deep Generative Models.

Real and Fake Spectrograms

Pre-requisites

NVIDIA GPU with CUDA and cuDNN
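A quick way to confirm this prerequisite before training is to ask PyTorch what it can see. The following is a minimal sketch, not part of this repo: the function name `check_gpu_prerequisites` and the report keys are hypothetical, and it only assumes the standard `torch.cuda.is_available()` and `torch.backends.cudnn.is_available()` calls.

```python
import importlib.util


def check_gpu_prerequisites():
    """Report whether PyTorch is installed and can see CUDA and cuDNN.

    Hypothetical helper for illustration; it is not part of the asrgen code.
    """
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "cudnn_available": False,
    }
    # Skip the import entirely if PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    report["cudnn_available"] = bool(torch.backends.cudnn.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_gpu_prerequisites().items():
        print(f"{name}: {ok}")
```

If `cuda_available` or `cudnn_available` prints `False`, the driver, CUDA toolkit, or PyTorch build is likely the place to look before running the training script.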

Data and pre-trained models:

Partial audio data
Pre-trained Generator and Discriminator using Cathy-Blizzard data

Setup

Clone this repo: git clone https://github.com/rafaelvalle/asrgen.git
Change into the repo directory: cd asrgen
Download the audio data and unzip it into the repo directory
Install python requirements: pip install -r requirements.txt

Training

python gan_train.py
(Optional) Monitor training with TensorBoard: tensorboard --logdir=./

Synthesize audio samples with a Generator

jupyter notebook --ip=127.0.0.1 --port=31337
Open gan_synthesis.ipynb in the notebook interface

Acknowledgements

This implementation uses code from the following repos: [NVIDIA's Tacotron 2](https://github.com/nvidia/tacotron2), and repos by Martin Arjovsky and Prem Seetharaman.

We are thankful to Prem Seetharaman and Markus Rabe for their feedback on the early draft of this paper.

We are grateful to NVIDIA for donating the Titan X used in this research.
