Yi Hsun Chen
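Thank you very much for your reply. Is the log Mel spectrogram you mentioned related to the Mel filter bank? Is the log Mel spectrogram left unsquared because of the PESQ measurement results as well, or did that change along with the change in the loss? Thank you~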
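@Curisan Hello, I would like to ask: when you reproduced the code, roughly how many utterances were in your train set? Were the test set utterances chosen at random, and roughly how many were there? My training set is about 4000 TIMIT utterances, with 94 of the N1-N100 noise types as the noise set. My testing set is 168 TIMIT utterances, with 6 of the N1-N100 noise types as the noise set, the same types as yours, and I also changed the sample rate to 16k. I trained for 100000 iterations, but the results are as follows. I am not sure where the problem is.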
What should I do to improve the PESQ? Below are the experimental results at SNR = 20.
Sorry, in addition, I would like to ask: if I want to use this speech enhancement system as a front end for ASR, how do I do this? Many thanks, Nick
Hello Qiuqiang, Mat_2d_to_3d converts features to the shape (n_segs, n_concat, n_freq). The center frame of the first round of stacking frames is t=1, and the center frame of the second...
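For readers following the frame-stacking point, here is a minimal sketch of turning a 2-D feature matrix of shape (n_frames, n_freq) into a 3-D array of shape (n_segs, n_concat, n_freq). The function name `stack_frames`, the `hop` parameter, and the zero-padding of short inputs are assumptions made for illustration; this is not the repository's Mat_2d_to_3d, and it may handle the edges and the center-frame indexing differently.

```python
import numpy as np


def stack_frames(feats, n_concat, hop=1):
    """Hypothetical helper: slide a window of n_concat frames over feats.

    feats: array of shape (n_frames, n_freq)
    returns: array of shape (n_segs, n_concat, n_freq)
    """
    n_frames, n_freq = feats.shape

    # Assumption: inputs shorter than one window are zero-padded at the end
    # so that at least one segment is produced.
    if n_frames < n_concat:
        pad = np.zeros((n_concat - n_frames, n_freq), dtype=feats.dtype)
        feats = np.concatenate([feats, pad], axis=0)
        n_frames = n_concat

    # Each segment starts hop frames after the previous one.
    starts = range(0, n_frames - n_concat + 1, hop)
    return np.stack([feats[s:s + n_concat] for s in starts])


# Toy input: 10 frames with 3 frequency bins each.
feats = np.arange(10 * 3, dtype=np.float32).reshape(10, 3)
segs = stack_frames(feats, n_concat=7, hop=1)

# With 0-based indexing, the center frame of segment k is input frame
# k * hop + n_concat // 2.
centers = segs[:, 7 // 2]
```

With hop=1 and no padding of the input edges, consecutive segments have consecutive center frames, and the first n_concat // 2 input frames never appear as a center.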
Hi Yong, Thank you for your reply! There are some questions I'd like to ask: 1. The "enhanced features for ASR" you mentioned, do you mean the magnitudes of log...