AndongLi comments

Results: 12 comments by AndongLi

Bad performance when using for speech enhancement

It seems to be caused by the choice of loss function, i.e., SI-SDR. SI-SDR does not constrain the magnitude of the waveform, which may cause the chopping effect. I think...
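The scale-invariance mentioned above can be checked numerically. The following is a minimal sketch using the standard SI-SDR definition, not code from the repository under discussion; the function name `si_sdr` and the sample values are illustrative assumptions.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB for two equal-length sample sequences."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    # Project the estimate onto the reference to get the scaled target.
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10(target_energy / residual_energy)


# Illustrative values only (not taken from any dataset).
reference = [0.5, -0.2, 0.1, 0.4]
estimate = [0.45, -0.25, 0.15, 0.35]
louder = [10.0 * e for e in estimate]

# Both calls give the same score up to rounding, so the loss alone
# gives the network no incentive to match the reference level.
score = si_sdr(estimate, reference)
score_louder = si_sdr(louder, reference)
```

Because rescaling the estimate leaves the score unchanged, the output level is unconstrained unless the loss is combined with a magnitude-sensitive term.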

Is the speech under the test/eti folder enhanced?

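> Is the speech under the test/eti folder enhanced? I listened to the audio in there and it is all noisy.

Hello, I downloaded it and took a look. It seems the wrong esti folder was uploaded earlier; those are not the enhanced speech files.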

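> I would also like to ask about the loss function: can the loss weight of the last stage be increased? Also, after training for 20 epochs my loss is still around 70. Is that normal? I would be very grateful if you could let me know.

This is relatively early work. Regarding the loss weights: since we care more about the final output, I think a more reasonable approach is to give the last stage a larger weight, so you might try weights of 0.1 (q = 1, ..., Q-1) and 1 (q = Q). In addition, from the perspective of model degrees of freedom, not sharing parameters across stages can achieve better results (although performance saturation and high computational complexity also become issues). As for the loss value, one way to check is to set Q = 1, in which case the model degenerates to a single-stage encoder-decoder structure. I suggest confirming that training behaves normally in this setting before increasing the number of stages. In my experience, a loss between 3 and 5 after convergence is normal.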
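The weighting suggested in this reply can be sketched as follows. This is only an illustration of the 0.1 / 1 scheme, not the repository's training code; the function names and the example loss values are hypothetical.

```python
def stage_weights(num_stages, early_weight=0.1, final_weight=1.0):
    """Weights for stages q = 1..Q: early_weight for q < Q, final_weight for q = Q."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    return [early_weight] * (num_stages - 1) + [final_weight]


def weighted_multistage_loss(stage_losses, early_weight=0.1, final_weight=1.0):
    """Combine per-stage loss values into a single scalar."""
    weights = stage_weights(len(stage_losses), early_weight, final_weight)
    return sum(w * loss for w, loss in zip(weights, stage_losses))


# Hypothetical per-stage losses for Q = 3 stages.
total = weighted_multistage_loss([4.0, 3.5, 3.0])

# With Q = 1 the sum reduces to the plain single-stage loss,
# which matches the sanity check suggested in the reply.
single = weighted_multistage_loss([3.0])
```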

Mixloss produces nan

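> Hi, I am using Mixloss, and the loss becomes nan as soon as I start training.
> 1. The LR is already set very small (0.00001);
> 2. There is no division by zero.
>
> What else could be the cause?

This may be caused by the compressed coefficient \alpha, i.e., 0.3. You may as well calculate the gradient of the network...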
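The reply is truncated, so the mechanism the author had in mind is not shown here. One common way a compression exponent below 1 can lead to nan, offered purely as an assumption, is that the derivative of `mag ** alpha` diverges as the magnitude approaches zero (for example in silent frames). The sketch below illustrates this with a hypothetical helper and a hypothetical `eps` offset; neither comes from the repository.

```python
import math

ALPHA = 0.3  # compression exponent mentioned in the reply


def compressed_mag_grad(mag, alpha=ALPHA, eps=0.0):
    """Derivative of (mag + eps) ** alpha with respect to mag, for mag >= 0."""
    base = mag + eps
    if base == 0.0:
        # alpha * base ** (alpha - 1) is unbounded at zero when alpha < 1.
        return math.inf
    return alpha * base ** (alpha - 1.0)


grad_at_zero = compressed_mag_grad(0.0)            # unbounded
grad_with_eps = compressed_mag_grad(0.0, eps=1e-8)  # large but finite
```

If this is the cause, an infinite gradient multiplied by zero elsewhere in the graph yields nan regardless of how small the learning rate is, which would be consistent with the symptoms in the question.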

Labels for multichannel output

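> Hello, I read your paper "EMBEDDING AND BEAMFORMING: ALL-NEURAL CAUSAL BEAMFORMER FOR MULTICHANNEL SPEECH ENHANCEMENT" and the corresponding code, and I have a question: what is the multichannel target? Neither the paper nor the code states it explicitly. Do you use one particular channel as the overall target, or are there multiple targets for multiple channels, each processed separately?

Hello, because the final step is a filter-and-sum operation, the mapping from input to output is a MISO process, and the label used is the target speech of the reference channel. If you want to use targets from multiple channels to obtain multichannel outputs, there are two options: one is MIMO, and the other is to exploit the rotational invariance of a circular array and run inference M times in turn (M is the number of channels).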
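The second option in this reply can be sketched as a loop over channel rotations. This is a minimal illustration, assuming a uniform circular array whose channels are ordered around the circle and a MISO model that treats index 0 as the reference channel; `rotate_channels`, `miso_to_multichannel`, and `miso_model` are hypothetical names, not part of the released code.

```python
def rotate_channels(channels, shift):
    """Cyclically shift the channel order so that channel `shift` comes first."""
    shift %= len(channels)
    return list(channels[shift:]) + list(channels[:shift])


def miso_to_multichannel(channels, miso_model):
    """Run a MISO model M times, once with each channel as the reference.

    `miso_model` is any callable that takes the list of M channels
    (reference first, by assumption) and returns one enhanced channel.
    """
    num_channels = len(channels)
    return [miso_model(rotate_channels(channels, m)) for m in range(num_channels)]


# Toy stand-in for a trained model: it just returns the reference channel.
def toy_model(chs):
    return chs[0]


outputs = miso_to_multichannel(["ch0", "ch1", "ch2", "ch3"], toy_model)
```

The cost is M forward passes instead of one, which is the trade-off against training a dedicated MIMO model.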

Question: Loss not decreasing

> Hi! I'd like to reproduce the very good results of your paper. When I run it on a 15,000 samples * 3s dataset, loss only decreases considerably at the...

can you send me the noise vector

> Hi Andong, your paper uses 115 types of noise for training and 5 types of noise for testing, but I can't get them all. Can you send me the 16k librosa concatenated noise long...

can you send me the noise vector

> [email protected] Hi, the bin file has been sent to your email.

RTNet on Asteroid ?

> Hi Andong,
>
> Congrats on your paper!
> Maybe you have seen [Asteroid](https://github.com/mpariente/asteroid), an open-source community-based source separation and speech enhancement toolkit.
> RTNet would be a...