StarGANv2-VC
For song vc what should I do
Hello, and thank you for sharing your great work. I have some questions.
- For singing VC in Mandarin, I tried training a new StarGANv2-VC model with the pretrained ASR and F0 models, but the results do not sound good. Do you have any advice?
- For singing VC in Mandarin, do I need to retrain the ASR or F0 model?
I'm looking forward to your reply, and thank you again.
Hello, panxin! I'm also working on singing VC with StarGANv2-VC. I didn't retrain the F0 and ASR models. Instead, I built a dataset consisting of Mandarin songs plus Mandarin, Japanese, and English speech. This is my result.
@Francis-Komizu Thank you for your reply. Indeed, I think using StarGANv2-VC for singing VC may need some further work.
@panxin801 I'm currently working on singing conversion using this model with some further modifications for better performance. I may submit my work to INTERSPEECH next year.
@yl4579 Congratulations! I'm looking forward to your work.
@yl4579, is INTERSPEECH 2022 in September? If so, could you share the paper link here?
@mraj96 Sorry, I meant INTERSPEECH next year, so it'll be 2023.
@yl4579, thank you for your work on StarGANv2-VC. We have been working on making StarGANv2-VC work in the singing domain. Please see our work, https://arxiv.org/abs/2210.11096, which extends StarGANv2-VC to singing voice in the any-to-any case.
The main modifications that make StarGANv2-VC work on singing voice are the removal of pitch features from the instance normalization layers of the generator and the use of an absolute pitch reconstruction loss instead of a normalized one.
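For anyone following along, here is a rough, framework-free sketch of the difference between the two pitch losses mentioned above. It is not the code from the paper. It assumes that "normalized" means dividing each F0 contour by its own temporal mean, that the distance is L1, and that all frames are voiced (non-zero F0). The function names and the toy contour are made up for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def l1(a, b):
    """Mean absolute difference between two equal-length contours."""
    return mean([abs(x - y) for x, y in zip(a, b)])


def normalized_f0_loss(f0_src, f0_conv):
    """L1 after dividing each contour by its own temporal mean.

    Assumption: this is what "normalized" refers to. Only the shape of
    the contour is compared, so a global pitch shift is not penalized.
    """
    src_mean = mean(f0_src)
    conv_mean = mean(f0_conv)
    src_norm = [f / src_mean for f in f0_src]
    conv_norm = [f / conv_mean for f in f0_conv]
    return l1(src_norm, conv_norm)


def absolute_f0_loss(f0_src, f0_conv):
    """L1 on raw F0 values in Hz, so the absolute pitch must match too."""
    return l1(f0_src, f0_conv)


# Toy contour in Hz, and the same melody one octave higher.
src = [220.0, 247.0, 262.0, 294.0]
octave_up = [2.0 * f for f in src]

print(normalized_f0_loss(src, octave_up))  # no penalty for the octave shift
print(absolute_f0_loss(src, octave_up))    # penalizes the octave shift
```

In this toy case the normalized loss cannot tell the two contours apart, while the absolute loss can. That would matter for singing, where the converted voice is expected to stay on the source notes.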
@mayank-git-hub Do you have a GitHub repository for ROSVC? I couldn't find the source code, and I'm very interested!