yong xu @ seattle

Results 61 comments of yong xu @ seattle

The MATLAB version will also help you understand the whole DNN decoding procedure.

Hi, the original results are here: https://ieeexplore.ieee.org/document/6932438/ Best regards, yong

Hi there, please use the *.m MATLAB file to run it. It should work on Windows 10.

Hi, fea_dim is the dimension of the feature vector for one frame, e.g., the log-power spectrum. If the sample rate is 16 kHz and you use a 512-point STFT, your fea_dim is 257. If the...
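To make the 512 -> 257 relation concrete, here is a minimal sketch. It assumes a real-valued signal and a one-sided spectrum, so an N-point FFT yields N/2 + 1 bins. The helper name `fea_dim_from_fft` is hypothetical and is not part of the repository.

```python
def fea_dim_from_fft(n_fft: int) -> int:
    """Number of one-sided spectrum bins for an even FFT size.

    Hypothetical helper for illustration only: a real-valued frame
    transformed with an n_fft-point FFT has n_fft // 2 + 1 unique bins
    (DC up to and including the Nyquist bin).
    """
    if n_fft <= 0 or n_fft % 2 != 0:
        raise ValueError("n_fft must be a positive even integer")
    return n_fft // 2 + 1


print(fea_dim_from_fft(512))  # 257
```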

Hi, if you put your noisy_speech.wav in this folder: https://github.com/yongxuUSTC/DNN-Speech-enhancement-demo-tool/tree/master/wav_lsp the code will generate my_own_noisy_speech.lsp automatically. Best regards, yong

Comment out the first line and just use cudaMemcpy. I have already updated it. Thanks for pointing it out.
else{
    //DevLinearOutCopy(streams[0],n_frames, cur_layer_units, cur_layer_x, cur_layer_y);
    cudaMemcpy(dev[0].out, cur_layer_y, n_frames*cur_layer_units*sizeof(float), cudaMemcpyDeviceToDevice);
}

16 kHz is a common sample rate, e.g., for mobile phone recordings. This code is not written for 44.1 kHz audio; you need to down-sample your 44.1 kHz files to 16 kHz using the sox tool. Or you...
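One way to script the sox down-sampling step is sketched below. It assumes sox is installed and on the PATH; the function names and file names are hypothetical. The command it builds is the standard `sox input.wav -r 16000 output.wav` form, where `-r` placed before the output file sets the output sample rate.

```python
import shutil
import subprocess

TARGET_RATE_HZ = 16000  # sample rate the comment above says the code expects


def sox_downsample_cmd(src, dst, rate_hz=TARGET_RATE_HZ):
    """Build the sox argument list that resamples src to rate_hz."""
    return ["sox", src, "-r", str(rate_hz), dst]


def downsample(src, dst, rate_hz=TARGET_RATE_HZ):
    """Run sox to resample one file; raises if sox is missing or fails."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox was not found on PATH")
    subprocess.run(sox_downsample_cmd(src, dst, rate_hz), check=True)


# Hypothetical file names, shown only to illustrate the command layout.
print(" ".join(sox_downsample_cmd("noisy_44k.wav", "noisy_16k.wav")))
```

Calling `downsample("noisy_44k.wav", "noisy_16k.wav")` would then write the 16 kHz file, provided the input exists.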

I think all you need to change is the dimension of the extracted log-power spectra. For 16 kHz, I used a 512-point FFT to generate 257-dimensional log-power spectra. For 44.1 kHz, you may use...
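As an illustration of how a 512-point FFT leads to 257 log-power values per frame, here is a minimal, dependency-free sketch. It is not the repository's feature extractor: the naive DFT, the absence of a window function, the log floor, and the function name `log_power_spectrum` are all assumptions made for this example.

```python
import cmath
import math


def log_power_spectrum(frame, n_fft=512, floor=1e-12):
    """Return log(|X[k]|^2) for k = 0 .. n_fft/2 of one real-valued frame.

    Uses a naive O(N^2) DFT so the example needs only the standard library.
    The floor avoids log(0) on empty bins (an assumption, not the author's value).
    """
    if len(frame) > n_fft:
        raise ValueError("frame is longer than n_fft")
    padded = list(frame) + [0.0] * (n_fft - len(frame))  # zero-pad to n_fft
    n_bins = n_fft // 2 + 1  # one-sided spectrum of a real signal
    lps = []
    for k in range(n_bins):
        acc = 0j
        for n, x in enumerate(padded):
            acc += x * cmath.exp(-2j * math.pi * k * n / n_fft)
        lps.append(math.log(max(abs(acc) ** 2, floor)))
    return lps


# One 512-sample frame of a 1 kHz sine sampled at 16 kHz.
frame = [math.sin(2 * math.pi * 1000 * n / 16000) for n in range(512)]
features = log_power_spectrum(frame)
print(len(features))  # 257
```

Changing `n_fft` changes the feature dimension in the same way, which is the adjustment the comment describes for other sample rates.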

Hi, no, a GPU is required for this program. However, Haoyu Li from Tokyo university created a TensorFlow version: https://github.com/yongxuUSTC/DNN-SpeechEnhancement TensorFlow should work well on either CPU or GPU. Best regards,...

Hi, actually we have a pretrained model for the MATLAB version. The model can be downloaded here: https://drive.google.com/file/d/0B5r5bvRpQ5DRR1lIV1hpZ0RLQ0E/view?usp=sharing and is also in config/*. The MATLAB script for decoding is here: https://github.com/yongxuUSTC/DNN-Speech-enhancement-demo-tool Best regards, yong...