DNN-for-speech-enhancement
Higher quality enhancement?
What if I want to train and test on 44.1 kHz WAV files? How do I modify the code to do that? It works for 16 kHz, but I am interested in higher sampling rates for my applications. Thanks!
16 kHz is the common sample rate, as in mobile phone recordings. This code is not set up for 44.1 kHz, so you need to down-sample your 44.1 kHz files to 16 kHz using the sox tool.
Or you can train a 44.1 kHz DNN yourself.
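For the down-sampling route, here is a minimal Python sketch as an alternative to sox. It assumes NumPy and SciPy are installed; the function name `downsample_44k_to_16k` is made up for illustration and is not part of this repository. It uses `scipy.signal.resample_poly`, which applies an anti-aliasing filter while resampling by a rational factor.

```python
import numpy as np
from scipy.signal import resample_poly


def downsample_44k_to_16k(x):
    """Resample a 1-D signal from 44.1 kHz to 16 kHz.

    The ratio 16000 / 44100 reduces to 160 / 441, so we upsample
    by 160 and downsample by 441 (polyphase filtering).
    """
    return resample_poly(x, up=160, down=441)


# One second of noise at 44.1 kHz as a stand-in for real audio (assumption).
x = np.random.default_rng(0).standard_normal(44100)
y = downsample_44k_to_16k(x)
print(len(x), "->", len(y))
```

Reading and writing the WAV files is left out here; any audio I/O library can supply the sample array.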
Yes, this is what I am interested in: training my own DNN for 44.1 kHz. All I am asking is how to do that with your code. Can you quickly point me in the right direction (the steps involved, etc.)? Thanks!
I think all you need to change is the dimension of the extracted log-power spectra.
For 16 kHz, I used a 512-point FFT to generate 257-dimensional log-power spectra. For 44.1 kHz, you may use a 4096- or 2048-point FFT, which generates 2049- or 1025-dimensional log-power spectra.
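The dimensions quoted above follow from the fact that an N-point FFT of a real signal has N/2 + 1 non-redundant bins. The sketch below illustrates this with NumPy; it is not the feature extraction code from this repository. The function name `log_power_spectra`, the Hamming window, the 50% hop, and the small floor added before the logarithm are all assumptions chosen for illustration.

```python
import numpy as np


def log_power_spectra(signal, n_fft, hop):
    """Return log-power spectra of shape (n_frames, n_fft // 2 + 1)."""
    window = np.hamming(n_fft)  # window choice is an assumption
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft keeps only the n_fft // 2 + 1 non-redundant frequency bins.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    power = np.abs(spectrum) ** 2
    # A small floor avoids log(0) on silent frames (assumed value).
    return np.log(power + 1e-10)


for sr, n_fft in [(16000, 512), (44100, 2048), (44100, 4096)]:
    # One second of noise as a stand-in for real audio (assumption).
    x = np.random.default_rng(0).standard_normal(sr)
    lps = log_power_spectra(x, n_fft, hop=n_fft // 2)
    frame_ms = 1000 * n_fft / sr
    print(f"{sr} Hz, {n_fft}-point FFT: {lps.shape[1]} dims, {frame_ms:.1f} ms frame")
```

A 512-point frame at 16 kHz spans 32 ms, while 2048 points at 44.1 kHz span about 46 ms and 4096 points about 93 ms, so the FFT size also changes the frame duration. Whatever size is chosen, the DNN's input and output layer sizes presumably need to match the new feature dimension.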
Have fun playing with it.