Curisan
Hello, after running your code I could not reproduce the good results you show in the enh_noisy_example folder. My loss and PESQ results are shown below and are close to yours, but the actual enhanced speech is much worse. A few questions: 1. How many training iterations did you use? Only 10,000 steps? 2. Is there anything to pay attention to when reproducing your results? (The noise in the figure above does not appear in the training set.)
@XIEchoAH Hello. The Nonspeech noise downloaded from that URL is 20 kHz; before training, I resampled it to 16 kHz with Audition. Also, in prepare_data.py, the call to read_audio inside the function create_mixture_csv does not pass a sampling frequency, so if the audio is not 16 kHz it will indeed cause the situation you describe (point 3). I made no other changes.
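As a scripted alternative to resampling in Audition, the 20 kHz → 16 kHz conversion can be sketched with scipy's polyphase resampler (a minimal sketch assuming mono float arrays; `resample_to_16k` is a hypothetical helper, not part of the repo):

```python
import numpy as np
from scipy.signal import resample_poly

def resample_to_16k(audio, orig_sr=20000, target_sr=16000):
    """Resample a mono signal to 16 kHz using polyphase filtering.

    For 20 kHz input the ratio reduces to up=4, down=5.
    """
    g = np.gcd(orig_sr, target_sr)
    return resample_poly(audio, target_sr // g, orig_sr // g)

# 1 second of 20 kHz audio becomes 16000 samples at 16 kHz
x = np.random.randn(20000)
y = resample_to_16k(x)
print(len(y))  # 16000
```

The same fix can also be applied at load time by passing the target sampling rate to the audio reader, which is what the missing argument in create_mixture_csv amounts to.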
@liuquande I trained the model on AVSpeech and met the same problem. Did you use the pretrained model?
It seems there is a small bug in train.py and decode.py. In train.py:

```
supervision_segments = supervision_segments[indices]
texts = supervisions['text']
assert feature.ndim == 3
# print(supervision_segments[:, 1] + supervision_segments[:, 2])
```
...
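If the issue is that `texts` is not reordered together with `supervision_segments`, the misalignment can be illustrated with a small sketch (hypothetical toy data; the real code's sorting criterion and types may differ):

```python
import numpy as np

# Toy segments: [sequence_index, start, duration]
supervision_segments = np.array([[0, 0, 30],
                                 [1, 0, 50],
                                 [2, 0, 40]])
texts = ["a", "b", "c"]

# Sort segments by duration, descending
indices = np.argsort(-supervision_segments[:, 2])
supervision_segments = supervision_segments[indices]

# Any parallel list must be permuted identically, or
# segment i no longer corresponds to text i.
texts = [texts[i] for i in indices]

print(texts)  # ['b', 'c', 'a']
```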
Mm, thanks, I have seen these two materials. But they are too limited for me. Could you provide any other material?
Thank you very much.