StarGANv2-VC

What should I do for song VC (singing voice conversion)?

Open panxin801 opened this issue 2 years ago • 9 comments

Hello, and thank you for sharing your great work. I have a couple of questions.

  1. For song VC in Mandarin, I tried training a new StarGANv2-VC model with the pretrained ASR and F0 models, but the results do not sound good. Do you have any advice?
  2. For song VC in Mandarin, do I need to retrain the ASR or F0 model?

I'm looking forward to your reply, and thank you again.

panxin801 avatar Sep 14 '22 10:09 panxin801

Hello, panxin! I'm also working on singing VC with StarGANv2-VC. I didn't retrain the F0 and ASR models. Instead, I built a dataset consisting of Mandarin songs plus Mandarin, Japanese, and English speech. This is my result.

sophiefy avatar Sep 16 '22 04:09 sophiefy

@Francis-Komizu Thank you for your reply. I agree that using StarGANv2-VC for song VC probably needs some further work.

panxin801 avatar Sep 16 '22 05:09 panxin801

@panxin801 I'm currently working on singing conversion using this model with some further modifications for better performance. I may submit my work to INTERSPEECH next year.

yl4579 avatar Sep 16 '22 05:09 yl4579

@yl4579 Congratulations! I'm looking forward to your work.

panxin801 avatar Sep 16 '22 05:09 panxin801

@yl4579, do you mean INTERSPEECH 2022, held in September? If so, could you share the paper link here?

MuruganR96 avatar Nov 21 '22 18:11 MuruganR96

@mraj96 Sorry, I meant INTERSPEECH next year, so it'll be 2023.

yl4579 avatar Nov 22 '22 03:11 yl4579

@yl4579, thank you for your work on StarGANv2-VC. We have been working on adapting StarGANv2-VC to the singing domain. Please see our work, https://arxiv.org/abs/2210.11096, which extends StarGANv2-VC to singing voice while also handling the any-to-any case.

mayank-git-hub avatar May 16 '23 12:05 mayank-git-hub

The main modifications that make StarGANv2-VC work on singing voice are removing the pitch features from the instance normalization layers of the generator and using an absolute pitch reconstruction loss instead of a normalized one.
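
As a rough illustration of the loss change only (this is not the ROSVC code, which is not shown in this thread), the sketch below contrasts the two kinds of pitch loss on toy F0 contours in plain Python. It assumes "normalized" means dividing each contour by its temporal mean before taking an L1 distance, and all function names and F0 values here are hypothetical.

```python
# Toy comparison of a normalized vs. an absolute F0 (pitch) reconstruction loss.
# Assumption: "normalized" = each contour divided by its own temporal mean.
# All names and values are hypothetical, for illustration only.

def l1(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def mean_normalize(f0):
    """Divide an F0 contour (in Hz) by its temporal mean."""
    mean = sum(f0) / len(f0)
    return [v / mean for v in f0]

def normalized_f0_loss(source_f0, converted_f0):
    """Compares only the shape of the contours; a constant scaling is ignored."""
    return l1(mean_normalize(source_f0), mean_normalize(converted_f0))

def absolute_f0_loss(source_f0, converted_f0):
    """Compares the raw F0 values, so a change in overall pitch is penalized."""
    return l1(source_f0, converted_f0)

source_f0 = [220.0, 247.0, 262.0, 294.0]   # toy melody in Hz
scaled_f0 = [v * 1.5 for v in source_f0]   # same shape, overall pitch raised

norm_loss = normalized_f0_loss(source_f0, scaled_f0)  # close to zero
abs_loss = absolute_f0_loss(source_f0, scaled_f0)     # large
print(norm_loss, abs_loss)
```

Under these assumptions the normalized loss stays near zero for a converted contour that has drifted away from the original key, while the absolute loss grows with the drift. That is presumably why an absolute loss suits singing, where the sung notes have to be preserved.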

mayank-git-hub avatar May 16 '23 12:05 mayank-git-hub

@mayank-git-hub Do you have a GitHub repository for ROSVC? I couldn't find the source code, and I'm very interested!

billnye2 avatar Oct 21 '23 07:10 billnye2