Retrieval-based-Voice-Conversion-WebUI

What are the differences between the two versions of pretrained models?

Open treya-lin opened this issue 2 years ago • 2 comments

Hi, I wonder what the differences are between the v1 and v2 pretrained models. Maybe I missed something, but I didn't find many details about the pretrained models in the documentation.

They use the same training dataset (VCTK), right? So what are the improvements or adjustments between the two versions, apart from the added support for the 32kHz sample rate?

Thanks!

treya-lin avatar Sep 28 '23 07:09 treya-lin

The v2 model changes the input from the 256-dimensional feature (HuBERT layer 9 output passed through final_proj) to the 768-dimensional feature (HuBERT layer 12 output, without final_proj), and adds 3 period discriminators. The training data is the same as for v1.
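To make the input change concrete, here is a minimal sketch of the layer and projection choice described above. It is not the project's actual code: `feature_spec`, `extract_features`, the random stand-in hidden states, and the bias-free `final_proj` matrix are all hypothetical, and only the layer numbers and dimensions (9 and 256 for v1, 12 and 768 for v2) come from the comment above.

```python
import random

HIDDEN_DIM = 768  # width of the HuBERT hidden states (per the comment above)
PROJ_DIM = 256    # width after final_proj, used by v1 only


def feature_spec(version):
    """Return (hubert_layer, apply_final_proj, feature_dim) for a model version."""
    if version == "v1":
        return 9, True, PROJ_DIM
    if version == "v2":
        return 12, False, HIDDEN_DIM
    raise ValueError(f"unknown version: {version!r}")


def extract_features(hidden_states, final_proj, version):
    """Pick the content features for the given version.

    hidden_states: dict mapping layer index -> list of frames,
                   each frame a list of HIDDEN_DIM floats (stand-in data).
    final_proj:    HIDDEN_DIM x PROJ_DIM weight matrix (stand-in, no bias).
    """
    layer, use_proj, _ = feature_spec(version)
    frames = hidden_states[layer]
    if not use_proj:
        return frames  # v2: use the 768-dimensional hidden states directly
    # v1: project each 768-dimensional frame down to 256 dimensions
    columns = list(zip(*final_proj))
    return [
        [sum(x * w for x, w in zip(frame, col)) for col in columns]
        for frame in frames
    ]


# Stand-in data: 3 frames of random hidden states for layers 9 and 12.
random.seed(0)
n_frames = 3
hidden_states = {
    layer: [[random.random() for _ in range(HIDDEN_DIM)] for _ in range(n_frames)]
    for layer in (9, 12)
}
final_proj = [[random.random() for _ in range(PROJ_DIM)] for _ in range(HIDDEN_DIM)]

v1_feats = extract_features(hidden_states, final_proj, "v1")
v2_feats = extract_features(hidden_states, final_proj, "v2")
print(len(v1_feats[0]), len(v2_feats[0]))  # prints "256 768"
```

Because the per-frame feature width differs (256 vs 768), a model trained on one version's features presumably cannot consume the other's without changing its input layer.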

RVC-Boss avatar Oct 09 '23 07:10 RVC-Boss

Where can I get the v2 version of hubert_base.onnx [256]?

Bella-Tim avatar Apr 11 '24 06:04 Bella-Tim