DNN-for-speech-enhancement

higher quality enhancement?

Open ghost opened this issue 7 years ago • 3 comments

What if I want to train and test on 44.1 kHz WAV files? How do I modify the code to do that? It works for 16 kHz, but I am interested in higher sampling rates for my applications. Thanks!

ghost, Mar 23 '17 14:03

16 kHz is a common sample rate, for example for mobile phone recordings. This code is not set up for 44.1 kHz, so you need to down-sample your 44.1 kHz audio to 16 kHz using the sox tool.

Or you can train a 44.1 kHz DNN yourself.
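
For the down-sampling route, here is a minimal sketch of driving sox from Python. It assumes sox is installed and on the PATH. The file names and helper names are hypothetical, and the `-r` option placed before the output file is one common way to set the output rate, not necessarily the exact command the repository's scripts use.

```python
import subprocess


def sox_resample_cmd(src, dst, rate=16000):
    """Build a sox command that writes `dst` at `rate` Hz (hypothetical helper)."""
    # Options placed before the output file name apply to the output file.
    return ["sox", src, "-r", str(rate), dst]


def downsample(src, dst, rate=16000):
    """Run sox; raises CalledProcessError if sox reports a failure."""
    subprocess.run(sox_resample_cmd(src, dst, rate), check=True)


# Example (hypothetical file names):
# downsample("speech_44k.wav", "speech_16k.wav")
```

The command is built separately from the call that runs it, so it can be inspected or logged before any file is touched.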

yongxuUSTC, Mar 25 '17 15:03

Yes, that is what I am interested in: training my own DNN for 44.1 kHz. All I am asking is how to do that with your code. Could you quickly point me in the right direction (the steps, etc.)? Thanks!

ghost, Mar 25 '17 17:03

I think all you need to change is the dimension of the extracted log-power spectra.

For 16 kHz, I used a 512-point FFT to generate 257-dimensional log-power spectra. For 44.1 kHz, you could use a 4096- or 2048-point FFT, which gives 2049- or 1025-dimensional log-power spectra.
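
The dimensions quoted above follow from the fact that a real-input FFT of length N has N/2 + 1 non-redundant bins. Here is a small sketch of that arithmetic, which also prints how much audio each FFT size covers. It assumes the analysis window is as long as the FFT, which may not match the repository's feature-extraction settings. The function names are made up for illustration.

```python
def lps_dim(n_fft):
    """Non-redundant bins of a real-input FFT of length n_fft."""
    return n_fft // 2 + 1


def window_ms(n_fft, sample_rate):
    """Duration in milliseconds of n_fft samples at sample_rate Hz."""
    return 1000.0 * n_fft / sample_rate


# (sample rate in Hz, FFT size) pairs mentioned in this thread.
CONFIGS = [(16000, 512), (44100, 2048), (44100, 4096)]


def summarize(configs=CONFIGS):
    """Return (sample_rate, n_fft, feature_dim, window_ms) per configuration."""
    return [
        (sr, n_fft, lps_dim(n_fft), round(window_ms(n_fft, sr), 1))
        for sr, n_fft in configs
    ]


for sr, n_fft, dim, ms in summarize():
    print(f"{sr} Hz, {n_fft}-point FFT: {dim} bins, {ms} ms window")
```

At 44.1 kHz, a 2048-point FFT spans roughly 46 ms, which is closer to the 32 ms covered by 512 points at 16 kHz than the roughly 93 ms of a 4096-point FFT. Any layer whose size is tied to the feature dimension would presumably need to change to match.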

Have fun with it.

yongxuUSTC, Mar 25 '17 17:03