Conv-TasNet: Bad performance when used for speech enhancement

Hi, very nice work. I noticed that some people are using Conv-TasNet for speech enhancement and getting good results. However, I ran into a problem when using this code for speech enhancement. I am trying to separate clean speech and noise from noisy speech, using the VCTK dataset. The waveforms of the results look very weird...

When I changed the mask activation to sigmoid, the result was still not good.

I wonder if anyone has any thoughts on how to solve this problem. Thanks in advance!

Aug 11 '20 17:08 jkzhang7

It seems to be caused by the choice of loss function, i.e., SI-SDR. SI-SDR does not constrain the magnitude of the waveform, which may cause the chopping effect. I think you can replace the SI-SDR loss with other options such as an SNR loss or a waveform-level L1 loss.
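
A minimal sketch of why this matters, using plain Python lists for readability (a real training loss would operate on tensors). The helper names `si_sdr`, `snr`, and `l1_loss` are hypothetical and are not taken from this repository, and the signals are synthetic. It shows that scaling an estimate by 10 leaves SI-SDR unchanged, while SNR and L1 both penalize the wrong magnitude.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB (mean removal omitted for brevity)."""
    # Project the estimate onto the target; the scale of the estimate cancels out.
    alpha = dot(estimate, target) / (dot(target, target) + eps)
    s_target = [alpha * t for t in target]
    e_noise = [e - s for e, s in zip(estimate, s_target)]
    return 10 * math.log10((dot(s_target, s_target) + eps) / (dot(e_noise, e_noise) + eps))

def snr(estimate, target, eps=1e-8):
    """Plain SNR in dB; the error is measured against the unscaled target."""
    err = [t - e for t, e in zip(target, estimate)]
    return 10 * math.log10((dot(target, target) + eps) / (dot(err, err) + eps))

def l1_loss(estimate, target):
    """Mean absolute error between the two waveforms."""
    return sum(abs(t - e) for t, e in zip(target, estimate)) / len(target)

# Synthetic example: a clean tone, a slightly noisy estimate, and the same estimate 10x louder.
target = [math.sin(0.1 * n) for n in range(200)]
estimate = [t + 0.05 * math.cos(0.37 * n) for n, t in enumerate(target)]
loud = [10.0 * e for e in estimate]

print("SI-SDR:", si_sdr(estimate, target), si_sdr(loud, target))  # nearly identical
print("SNR:   ", snr(estimate, target), snr(loud, target))        # drops sharply for the loud copy
print("L1:    ", l1_loss(estimate, target), l1_loss(loud, target))
```

To use SNR as a training loss, one would minimize its negative; L1 can be minimized directly.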

Aug 14 '20 16:08 Andong-Li-speech

@Andong-Li-speech Hi, thanks for your suggestions! The result is still not very good after changing the loss function to the SNR loss, but it works much better! If you are also working on this, what kind of loss function are you using? Thanks a lot in advance!

Aug 18 '20 23:08 jkzhang7

@jkzhang7 Hi, did you get better performance? I am facing the same problem now. Best wishes to you!

Aug 28 '20 11:08 LittleFlyingSheep

@LittleFlyingSheep Hi~ Have you solved this problem? I seem to have the same problem: the magnitude of the separated waveform is too large and it does not sound very good. Thanks a lot if you could give me some advice~

Jun 08 '21 03:06 forestlee95

@forestlee95 One way I chose to solve it is to rescale the waveform manually. I take the max value of the noisy input and use it to rescale the output. This gives relatively good performance. It is just a workaround I resorted to. If you have any other methods, please let me know.
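
A rough sketch of one way to read this workaround, assuming the goal is for the output peak to match the peak of the noisy input. The function name `match_peak` and the sample values are hypothetical, and plain Python lists stand in for waveform arrays.

```python
def match_peak(enhanced, noisy, eps=1e-8):
    """Rescale `enhanced` so its peak absolute value matches that of `noisy`."""
    peak_in = max(abs(x) for x in noisy)
    peak_out = max(abs(x) for x in enhanced)
    # eps guards against division by zero for an all-zero output.
    gain = peak_in / (peak_out + eps)
    return [gain * x for x in enhanced]

# Toy values: the model output is far louder than the noisy input.
noisy = [0.2, -0.5, 0.3, 0.1]
enhanced = [4.0, -8.0, 6.0, 1.0]
rescaled = match_peak(enhanced, noisy)
print(rescaled)
```

This only fixes the overall level after inference; it does not change what the model learned.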

Jun 08 '21 05:06 LittleFlyingSheep

@LittleFlyingSheep @jkzhang7 Hi, I am looking for the speech enhancement performance of Conv-TasNet on the VCTK dataset. Do you have any performance data for it? Much appreciated.

Mar 22 '22 14:03 sewichou

Received.

Mar 22 '22 14:03 LittleFlyingSheep

How did you solve it? I hit the same problem while testing.

Aug 31 '22 07:08 yyd19948

Received.

Oct 11 '22 08:10 LittleFlyingSheep
