Retrieval-based-Voice-Conversion-WebUI

Regarding inputted speaker content

Open suzhenghang opened this issue 1 year ago • 0 comments

During inference, retrieved features are used to replace the input features in order to prevent speaker identity leakage. Given that replacement, how can we ensure that the generated speech still preserves the content spoken by the original input speaker? The relevant line is:

feats = ( torch.from_numpy(npy).unsqueeze(0).to(self.device) * index_rate + (1 - index_rate) * feats )
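To make the question concrete, here is a minimal sketch of what that line computes, written with NumPy instead of torch so it runs standalone. The helper name `blend_features`, the toy shapes, and the random data are assumptions for illustration only and are not taken from the repository. It shows that the line is a linear interpolation: `index_rate = 0` keeps the input features unchanged, `index_rate = 1` replaces them entirely with the retrieved ones, and values in between mix the two.

```python
import numpy as np


def blend_features(feats, retrieved, index_rate):
    """Linearly interpolate between retrieved and input features.

    Hypothetical helper mirroring the expression quoted above:
    retrieved * index_rate + (1 - index_rate) * feats
    """
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be in [0, 1]")
    return retrieved * index_rate + (1.0 - index_rate) * feats


# Toy data: 1 batch, 4 frames, 3 feature dimensions (shapes are assumed).
rng = np.random.default_rng(0)
feats = rng.standard_normal((1, 4, 3)).astype(np.float32)  # input content features
npy = rng.standard_normal((4, 3)).astype(np.float32)       # retrieved features
retrieved = npy[np.newaxis, ...]                           # NumPy analogue of .unsqueeze(0)

only_input = blend_features(feats, retrieved, 0.0)      # input features untouched
only_retrieved = blend_features(feats, retrieved, 1.0)  # input fully replaced
half = blend_features(feats, retrieved, 0.5)            # equal mix of both
```

With `index_rate` close to 1, the output depends almost entirely on the retrieved features, which is exactly the case the question is about: whether the retrieved frames still carry the same content as the input frames they replace.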

Jul 10 '23 14:07 suzhenghang
