Retrieval-based-Voice-Conversion-WebUI Asking about the long-promised RVCv3 pre-trained model

Hello!

Honestly, I don't know whether somebody has already asked this, but I'd like to ask the official developers of this project (@RVC-Boss @fumiama @yxlllc) whether there is an ETA or a plan for releasing the new RVCv3 pre-trained model, which the README promises:

"Please look forward to the base model of RVCv3 with larger parameters, larger dataset, better effects, basically flat inference speed, and less training data required."

Consider that this project is followed by an entire community of enthusiasts, developers, and ordinary users who lack the skills or the resources to train their own models, so they are probably waiting for an update from you.

I admire the hard work that you put into this project almost every day, so much so that my friends and I created a Discord community with the sole aim of using RVC for professional voice cloning, together with the best AI tools available in the open-source community (Resemblyzer, UVR5, and Demucs, to name a few).

I hope that everything is going well, and I hope to get a response from your team. Thank you so much for your awesome work. See you soon and stay well.

NOTE: A useful issue about the possibility of changing the HuBERT model: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/1542 (it could be a useful addition to the RVCv3 architecture)

Apr 23 '24 16:04 GabryB03

We actually have some experimental models, but the performance improvements have not met expectations. A key factor may be that the pre-trained ContentVec model (hubert_base.pt) caps the quality that can be reached, so we are now considering retraining a stronger ContentVec model. That training is very difficult, though, and requires a large amount of data and computing power.

Apr 24 '24 06:04 yxlllc

I understand the issue that you are facing with ContentVec. To get sufficient computing power, you could also try training the new models on a rented machine (https://vast.ai/) with powerful GPUs and CPUs, though whether this is an option depends on your budget and on what you need for the training. I hope you will succeed, and I wish you the best of luck with the continuation of the project.

Apr 24 '24 13:04 GabryB03

Hi @yxlllc, I think we could help you with these obstacles and move forward.

I am trying to train a better hubert_base.pt (with ContentVec) because we realized that we have reached the technical limits of RVC.

I'm at the stage where I have prepared a script for fine-tuning HuBERT with ContentVec, and I'm running experiments with it.

I would like to help you with the training. I can help with the computing, and I also have some training scripts for HuBERT that I can share with you. Please contact me here: [email protected]

Thank you very much for your work. We all appreciate it!

May 04 '24 05:05 Lukysoon

@Lukysoon thank you so much for your contribution to RVC!

I very much hope that this will help the developers get better results for the new RVCv3 architecture.

May 05 '24 06:05 GabryB03

Why not use speaker-invariant clustering (SPIN) instead of ContentVec/HuBERT?

https://arxiv.org/pdf/2403.06260

May 12 '24 15:05 JackVinati

Is there any implementation of this? Have you already tried it with RVC?

May 13 '24 14:05 Lukysoon

Is there any implementation of this? Have you already tried it with RVC?

There is already a repo with pretrained models: https://github.com/vectominist/spin

I would like to know what the developers think about it. @RVC-Boss @fumiama @yxlllc

P.S. I want to try training it on an A100 80GB and see the results.

May 13 '24 14:05 JackVinati

We actually have some experimental models, but the performance improvements have not met expectations.

@yxlllc Well, could you share one of them anyway, and just call it "RVC 2.5" or something?

May 20 '24 20:05 MethanJess

Any news on this?

Jun 26 '24 17:06 zsxkib
