Build Speech Enhancement Dataset

Scripts for building speech enhancement datasets in several formats.

Dependencies

time_domain: Speech (utterance) level. Each noisy waveform is paired with its corresponding clean waveform.
time_domain_wav: Same as time_domain, except that each utterance is saved separately instead of storing all the speech in a single .pkl file.
frequency_domain_0: Speech (utterance) level. Each noisy spectrum is paired with its corresponding clean spectrum, and the two have the same size.
frequency_domain_1: Frame level. The noisy input spans multiple frames and the clean target is a single frame. The center frame of the noisy input is aligned with the clean frame.
frequency_domain_2: Frame level. Both the noisy input and the clean target span multiple frames, with the same number of frames in each.
mask_0: Frame level. The noisy input spans multiple frames and the mask is a single frame. The center frame of the noisy input is aligned with the mask frame.
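
The center-frame alignment used by the frame-level modes (frequency_domain_1 and mask_0) can be illustrated with a minimal sketch. This is not taken from the scripts themselves: the function name `center_aligned_pairs`, the `n_pad` parameter (frames of context on each side), and the list-of-frames layout are all assumptions made for illustration. The real scripts may use a different window size, array layout, or edge handling.

```python
def center_aligned_pairs(noisy_frames, target_frames, n_pad=3):
    """Pair each window of 2*n_pad+1 noisy frames with the target frame at its center.

    noisy_frames, target_frames: lists of frames (each frame is a list of
    frequency bins) with the same number of frames.
    Hypothetical helper; n_pad is an assumed parameter, not a documented option.
    """
    if len(noisy_frames) != len(target_frames):
        raise ValueError("noisy and target spectra must have the same number of frames")
    width = 2 * n_pad + 1
    pairs = []
    # Slide a window over the utterance; windows that would run past
    # either end are skipped, so short utterances yield no pairs.
    for start in range(len(noisy_frames) - width + 1):
        window = noisy_frames[start:start + width]  # multi-frame noisy input
        target = target_frames[start + n_pad]       # single frame at the window center
        pairs.append((window, target))
    return pairs


# Toy spectra: 10 frames of 4 bins; each value encodes its frame index.
clean = [[float(t)] * 4 for t in range(10)]
noisy = [[v + 0.5 for v in frame] for frame in clean]
pairs = center_aligned_pairs(noisy, clean, n_pad=3)
```

With 10 frames and a 7-frame window, this yields 4 pairs; in each pair the middle frame of the noisy window and the target frame come from the same time index. For mask_0 the same pairing would apply with mask frames passed in place of clean frames.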

python [time_domain.py|time_domain_wav.py|frequency_domain_0.py|frequency_domain_1.py|mask_0.py] -C config.json
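
For readers unfamiliar with this invocation style, here is a rough sketch of how a script might read the `-C config.json` argument. Only the `-C` flag comes from the command above; the `--config` long form, the `load_config` name, and the use of `argparse` are assumptions, and the keys expected inside config.json are not documented here, so the sketch simply returns the parsed JSON.

```python
import argparse
import json


def load_config(argv=None):
    """Parse -C <path> and return the JSON config as a dict.

    Hypothetical helper: the actual scripts may parse arguments differently.
    """
    parser = argparse.ArgumentParser(description="Build speech enhancement dataset")
    # "-C" matches the command shown above; "--config" is an assumed long form.
    parser.add_argument("-C", "--config", required=True, help="path to a JSON config file")
    args = parser.parse_args(argv)
    with open(args.config, "r", encoding="utf-8") as f:
        return json.load(f)
```

Calling `load_config()` with no arguments reads from the command line, while passing a list such as `["-C", "config.json"]` is convenient for testing.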