Problems about this project.

Open linan06kuaishou opened this issue 3 years ago • 6 comments

Firstly, thank you for sharing this code. I trained the model on the VCTK dataset, but unfortunately I didn't get good results. These are the main problems I found:

1. The audio produced by the generator is just tonal noise, completely unrelated to the input signal, even after several epochs of training.
2. The components of g_loss are badly unbalanced: the adv loss is on the order of 1e0, the feat loss about 1e3, and the rec loss about 1e6. I tried scaling them to the same magnitude, but it didn't seem to improve the output signal.
3. I also tried implementing the paper myself and got poor-quality audio. There must be a mistake somewhere, and I really don't have a clue.

@wesbz Have you encountered the problems above? Or have you gotten promising results with this project?
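For reference, the rescaling described in point 2 can be sketched roughly as follows. This is only an illustration in plain Python, not the repository's training code: the function names `balance_weights` and `generator_loss` are hypothetical, and the inputs are the approximate magnitudes quoted above rather than measured values.

```python
def balance_weights(adv, feat, rec, eps=1e-12):
    """Return one weight per loss term so each weighted term has magnitude ~1.

    Hypothetical helper: the weights are the inverse of each term's magnitude.
    """
    return tuple(1.0 / max(abs(x), eps) for x in (adv, feat, rec))


def generator_loss(adv, feat, rec, weights):
    """Weighted sum of the adversarial, feature, and reconstruction terms."""
    w_adv, w_feat, w_rec = weights
    return w_adv * adv + w_feat * feat + w_rec * rec


# Approximate magnitudes reported in this issue (assumed, not measured here).
adv, feat, rec = 1e0, 1e3, 1e6
weights = balance_weights(adv, feat, rec)
total = generator_loss(adv, feat, rec, weights)
print(weights, total)
```

In a real training loop the weights would presumably be computed from detached loss values (or fixed ahead of time) so that the rescaling itself does not contribute gradients. As noted in point 2, balancing the magnitudes alone did not fix the output.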

linan06kuaishou avatar Dec 23 '21 10:12 linan06kuaishou

Hi, thanks for your comments! Unfortunately, I didn't find time to test that model all the way through. If you find something, feel free to send a pull request. I'll let you know if I find something ;)

wesbz avatar Dec 25 '21 17:12 wesbz

I also ran into this problem; the results are terrible.

liuyoude avatar Jan 20 '22 06:01 liuyoude

Feel free to pursue and do a PR ;)

wesbz avatar Jan 20 '22 17:01 wesbz

I also got bad-quality results. Have you solved it yet?

MasterEndless avatar Feb 25 '22 04:02 MasterEndless

Has anybody trained the model successfully?

Liujingxiu23 avatar Mar 03 '23 03:03 Liujingxiu23

Have you successfully trained the model?

iam-Yue avatar May 14 '24 14:05 iam-Yue