Yihui Fu

Results 14 comments of Yihui Fu

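> Hello, yesterday's issue seems to have disappeared somehow. I received your reply by email; the training loss is normal. I am still looking for the root cause of the problem. If you could release the source code used to test the model, it would help me a lot. One thing I would like to ask: why does the loss computation return three values? Looking forward to your reply. > > Yesterday's issue was: the output is silent when testing the model. The code is all on my previous server, so I cannot provide it right now. Here are some ideas that may help you solve the problem: 0. Check the data to see whether the noisy and clean signals are aligned. 1. Run a single utterance through the STFT and iSTFT code in my repository and see whether it is reconstructed perfectly; if not, I suggest using torchaudio instead. When I did this work, torchaudio did not yet support complex spectra; now it does. 2. When you say the loss is normal, which loss do you mean? The three losses are MAE, MSE, and SI-SNR, as explained in the paper.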
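The reconstruction check suggested in step 1 can be sketched as follows. This is a minimal, dependency-free illustration, not the repository's code: `hann`, `stft`, `istft`, and `roundtrip_error` are hypothetical names, the transform is a toy naive DFT with a Hann window and 50% overlap, and the sine wave stands in for a real utterance. In practice the two toy functions would be replaced by the STFT and iSTFT under test.

```python
import cmath
import math


def hann(win_len):
    """Periodic Hann window; copies shifted by win_len // 2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / win_len) for k in range(win_len)]


def stft(signal, win_len=8, hop=4):
    """Toy STFT: windowed frames followed by a naive DFT (stand-in only)."""
    window = hann(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + k] * window[k] for k in range(win_len)]
        frames.append([
            sum(seg[k] * cmath.exp(-2j * math.pi * f * k / win_len) for k in range(win_len))
            for f in range(win_len)
        ])
    return frames


def istft(frames, win_len=8, hop=4):
    """Toy inverse: naive inverse DFT, overlap-add, divide by the window sum."""
    window = hann(win_len)
    length = hop * (len(frames) - 1) + win_len
    out = [0.0] * length
    norm = [0.0] * length
    for i, spec in enumerate(frames):
        for k in range(win_len):
            sample = sum(
                spec[f] * cmath.exp(2j * math.pi * f * k / win_len) for f in range(win_len)
            ).real / win_len
            out[i * hop + k] += sample
            norm[i * hop + k] += window[k]
    return [o / n if n > 1e-8 else 0.0 for o, n in zip(out, norm)]


def roundtrip_error(signal, win_len=8, hop=4):
    """Max absolute difference between a signal and its STFT -> iSTFT copy."""
    # Zero-pad both ends so every original sample is fully covered by frames.
    padded = [0.0] * win_len + list(signal) + [0.0] * win_len
    padded += [0.0] * ((-(len(padded) - win_len)) % hop)
    rebuilt = istft(stft(padded, win_len, hop), win_len, hop)
    rebuilt = rebuilt[win_len:win_len + len(signal)]
    return max(abs(a - b) for a, b in zip(signal, rebuilt))


utterance = [math.sin(0.3 * n) for n in range(32)]  # placeholder for real audio
error = roundtrip_error(utterance)
print(f"max reconstruction error: {error:.2e}")
```

An error near floating-point precision suggests the transform pair is consistent. A large error would point to a mismatch in window, hop size, or normalization, which is one possible cause of degraded or silent output.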

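> Thank you very much for your reply. I will try to solve the problem following the ideas you provided. About the loss: the training loss decreases over the epochs. The loss computation returns three values; for example, when computing the complex loss: def calloss_cplxmse(output, source): # B 2 F T loss = 0 output_real, output_imag = output[:,0], output[:,1] source_real, source_imag = source[:,0], source[:,1] for i in range(output.shape[0]): loss_real =...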

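> OK, thank you for your patient reply. I am a beginner, so I ask a lot of questions, haha. No problem at all. Feel free to reach out any time you have questions.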

> Hi felix, Thanks for sharing this project, great work! I encountered some issues when running your code directly by: python uformer.py I have tried torch version from 1.8.1 to...

Hi Hervé BREDIN, Thank you for your question. The speaker label is only local to each file, which means 001-M of session 1 is not the same person as 001-M...

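> 1. The full-band magnitude-spectrum loss (calloss_magmse) is divided at the end by the batch and frequency-bin dimensions, whereas the sub-band magnitude-spectrum loss (calloss_magmse_subband) is divided by the batch and the number of frames, since output_mag.shape[2] should be the T dimension; > 2. Also, why not average over the T dimension? Normally, F.l1_loss with reduction set to mean averages over batch, F, and T. Does this affect the results in any way? It is mainly to keep the different loss terms at a similar order of magnitude, which makes it easier to set the weights.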
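The effect of the normalization choice on the loss scale can be illustrated with a small sketch. It is not the repository's `calloss_magmse`; the function names below are hypothetical, the inputs are plain nested lists shaped [B][F][T], and absolute error is used for simplicity (the same scaling argument applies to squared error).

```python
def abs_error_sum(output, target):
    """Sum of |output - target| over nested lists shaped [B][F][T]."""
    return sum(
        abs(o - t)
        for out_b, tgt_b in zip(output, target)
        for out_f, tgt_f in zip(out_b, tgt_b)
        for o, t in zip(out_f, tgt_f)
    )


def l1_mean_all(output, target):
    """Average over batch, frequency, and time (like a 'mean' reduction)."""
    b, f, t = len(output), len(output[0]), len(output[0][0])
    return abs_error_sum(output, target) / (b * f * t)


def l1_per_batch_and_bin(output, target):
    """Divide by batch and frequency only, so the value grows with T."""
    b, f = len(output), len(output[0])
    return abs_error_sum(output, target) / (b * f)


B, F, T = 2, 3, 4
output = [[[1.0] * T for _ in range(F)] for _ in range(B)]
target = [[[0.5] * T for _ in range(F)] for _ in range(B)]

print(l1_mean_all(output, target))           # 0.5
print(l1_per_batch_and_bin(output, target))  # 2.0, i.e. T times larger
```

Skipping the division by T multiplies the loss by the number of frames, which is one way to bring a small-valued term closer in magnitude to the other terms before weighting them. Which dimensions to divide by in a real setup depends on the scale of the other losses.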

> It looks like the amount of non-overlapped data is much smaller than the overall corpus. I am seeing less than 20 hours. Is this correct? > > Thanks Michael...

> hi yihui, thanks for awesome system implementation, I can only get the speaker info of the train set, but no speaker info for the test set. http://aishell-4.oss-cn-hangzhou.aliyuncs.com/spk_info.xlsx can you...

Sorry, there is no training script. Maybe you can use https://github.com/kaituoxu/Conv-TasNet as an example.

> The paper is one of the best out there, congrats! I am trying to run `uformer.py` but I get the following error: > > `RuntimeError: Given normalized_shape=[12], expected input...