SEGAN
                                
                                 SEGAN copied to clipboard
                                
                                    SEGAN copied to clipboard
                            
                            
                            
                        A PyTorch implementation of SEGAN based on INTERSPEECH 2017 paper "SEGAN: Speech Enhancement Generative Adversarial Network"
SEGAN
A PyTorch implementation of SEGAN based on INTERSPEECH 2017 paper SEGAN: Speech Enhancement Generative Adversarial Network.
Requirements
- Anaconda
- PyTorch
conda install pytorch torchvision -c pytorch
- librosa
pip install librosa
Datasets
The clear and noisy speech datasets are downloaded from DataShare.
Download the 56kHZ train datasets and test datasets, then extract them into data directory.
If you want using other datasets, you should change the path of data defined on data_preprocess.py.
Usage
Data Pre-process
python data_preprocess.py
The pre-processed datas are on data/serialized_train_data and data/serialized_test_data.
Train Model and Test
python main.py ----batch_size 128 --num_epochs 300
optional arguments:
--batch_size             train batch size [default value is 50]
--num_epochs             train epochs number [default value is 86]
The test results are on results.
Test Audio
python test_audio.py ----file_name p232_160.wav --epoch_name generator-80.pkl
optional arguments:
--file_name              audio file name
--epoch_name             generator epoch name
The generated enhanced audio is on the same directory of input audio.
Results
The example results and the pre-train Generator weight can be downloaded from BaiduYun(access code:tzdd).