
Optional "breath" and "non verbal" voice sound training

Open tomakorea opened this issue 1 year ago • 1 comments

Is your feature request related to a problem? Please describe.

After long hours of training, I got a pretty good voice model. However, training and inference don't handle breaths and non-verbal voice sounds (laughs, screams, etc.) well. Even though I used a studio-quality dataset of 270 files (20 minutes of non-stop voice) that includes breaths and laughs, inference usually doesn't know how to handle them. Breaths are the worst case: sometimes they kind of work, and sometimes inference adds a lot of micro-stuttering that makes the voice sound like a robot. It would be great to have an option to train on all of these sounds, either together with or separately from the already trained model.

Describe alternatives you've considered

Currently, the only workaround is to re-run inference on the same line of dialogue again and again until it generates a glitch-free breath, which can take up to 10 tries. The other solution I found is to delete all the breaths, but that makes the dialogue sound less natural.

Additional context

It would be cool to have a separate option to train these sounds, and maybe mix them into the main dataset, so that really big models wouldn't need to be retrained. I have several models that took me over 30 hours of training... Tools like iZotope RX De-Breath have a reverse option that keeps all the breaths and removes the voice. Using this, it would be easy to collect a lot of breaths from the same voice and get them ready for training.
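To make the breath-collection idea concrete, here is a rough sketch of how a breath-only track (for example, one exported with the reverse option mentioned above) could be cut into individual clips. This is not part of so-vits-svc-fork. The function name `find_breath_segments`, the threshold, and the frame sizes are all hypothetical choices for illustration, and it assumes the audio is already loaded as a list of mono float samples in the range -1.0 to 1.0.

```python
import math


def find_breath_segments(samples, sample_rate, threshold=0.01,
                         frame_ms=20, min_len_ms=100):
    """Return (start, end) sample indices of non-silent regions.

    Hypothetical helper: a simple RMS energy gate, not a real breath detector.
    It only works well if the voice has already been removed from the track.
    """
    frame = max(1, sample_rate * frame_ms // 1000)   # samples per analysis frame
    min_len = sample_rate * min_len_ms // 1000       # drop regions shorter than this
    n_frames = len(samples) // frame                 # trailing partial frame is ignored

    segments = []
    start = None
    for i in range(n_frames):
        chunk = samples[i * frame:(i + 1) * frame]
        rms = math.sqrt(sum(x * x for x in chunk) / frame)
        if rms > threshold and start is None:
            start = i * frame                        # region begins
        elif rms <= threshold and start is not None:
            segments.append((start, i * frame))      # region ends
            start = None
    if start is not None:
        segments.append((start, n_frames * frame))   # region runs to the end

    return [(s, e) for s, e in segments if e - s >= min_len]


if __name__ == "__main__":
    sr = 16000
    # Synthetic stand-in for a breath-only track: silence, a 0.5 s "breath",
    # silence, a 0.05 s click that is too short to keep, then silence again.
    track = ([0.0] * sr + [0.1] * (sr // 2) + [0.0] * sr
             + [0.1] * (sr // 20) + [0.0] * (sr // 2))
    for start, end in find_breath_segments(track, sr):
        print(f"breath from {start / sr:.2f}s to {end / sr:.2f}s")
```

Each returned range could then be written out as its own file and added to the dataset folder alongside the regular voice clips. Whether the model actually learns cleaner breaths from that is exactly the open question of this request.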

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

Are you willing to resolve this issue by submitting a Pull Request?

  • [ ] Yes, I have the time, and I know how to start.
  • [X] Yes, I have the time, but I don't know how to start. I would need guidance.
  • [ ] No, I don't have the time, although I believe I could do it if I had the time...
  • [ ] No, I don't have the time and I wouldn't even know how to start.

tomakorea avatar May 29 '23 19:05 tomakorea

same problem

CloudTronUSA avatar May 31 '23 07:05 CloudTronUSA