MaxMax2016 comments

Results: 243 comments of MaxMax2016

is this better than diff svc and so vit svc?

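@hongwen-sun Our goals differ: you want an SVC with good audio quality, whereas I am researching technical approaches, such as how effective whisper is for singing voice conversion, and speaker adaptation. I need an efficient setup to validate some of my ideas.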

is this better than diff svc and so vit svc?

Besides, I personally favor simplicity. What is your view on Occam's razor?

is this better than diff svc and so vit svc?

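whisper is the best-performing multilingual ASR model available so far, and its quality is reflected directly in its recognition rate. It was trained on massive amounts of multilingual data, which other open-source models cannot match. If you want to do voice conversion, hubert or other self-supervised models may be more suitable: they carry non-linguistic information such as tone and emotion, perhaps gender information as well, and possibly leaked timbre.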

is this better than diff svc and so vit svc?

@hongwen-sun Nice. Having a doubt and then going to verify it: you have the makings of a good researcher!

Confused

same as: https://huggingface.co/spaces/maxmax20160403/sovits5.0

audio examples

Yes, they are a mix of male and female voices. maxgan_pretrain_32K.pth was trained on this data: https://github.com/Multi-Singer/Multi-Singer.github.io

you implementing vocos?

Here we use bigvgan + neural source filter, so we do not have the problem shown in the picture. vocos is better than bigvgan without neural source filter. vocos is open...

num_samples=0

Maybe an error occurred somewhere in step 0 through step 7; you can check the earlier steps.
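One way to check the earlier steps is to look for a step whose output folder is missing or empty, since an empty folder upstream could plausibly lead to num_samples=0 later. The sketch below is only an illustration: the helper name `find_empty_outputs` and the idea of passing one folder per step are assumptions, not part of the project's scripts, and the actual folder names must be taken from your own setup.

```python
from pathlib import Path


def find_empty_outputs(step_dirs):
    """Return the directories that are missing or contain no files.

    step_dirs is a hypothetical list with one output folder per
    preprocessing step; substitute the folders your own run uses.
    """
    empty = []
    for d in step_dirs:
        p = Path(d)
        # A step counts as "produced output" if at least one regular file
        # exists anywhere under its folder.
        has_file = p.is_dir() and any(f.is_file() for f in p.rglob("*"))
        if not has_file:
            empty.append(str(d))
    return empty
```

The first folder reported is usually the place to start looking, because every later step depends on it.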

Comparison of Diff-svc and so-vits-svc results

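1. There should be no timbre leakage; you can test this with the released model, which was trained on 56 speakers. 2. Voice conversion is not supported, but training singing voice conversion on speech data is supported.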

Comparison of Diff-svc and so-vits-svc results

https://github.com/PlayVoice/lora-svc/blob/main/svc_preprocess_f0.py#L13 During training, the maximum pitch was set to 900; you need to change this parameter according to your actual data when training the model.
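To pick a replacement value, one rough approach is to look at the highest pitch in your own data and add some headroom. The sketch below assumes the limit is expressed in Hz and that you already have per-frame F0 values with unvoiced frames marked as 0. The helper `suggest_f0_ceiling`, the 10% margin, and the rounding step are illustrative choices, not something taken from svc_preprocess_f0.py.

```python
import math


def suggest_f0_ceiling(f0_hz, default_ceiling=900.0, margin=1.1, step=50.0):
    """Suggest an F0 ceiling in Hz from observed per-frame pitch values.

    f0_hz: iterable of F0 values in Hz; unvoiced frames are 0 or negative.
    default_ceiling: returned when no voiced frame is found (900 mirrors
    the value mentioned above; the unit is an assumption).
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return default_ceiling
    peak = max(voiced)
    # Add headroom above the observed peak, then round up to a multiple of step.
    return math.ceil(peak * margin / step) * step


# Example: a recording whose highest voiced frame is 440 Hz.
print(suggest_f0_ceiling([0.0, 220.0, 440.0, 0.0]))  # 500.0
```

If the suggested value comes out above 900, the data contains notes that the default setting would cut off, which is the case the comment above is warning about.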