GPT-SoVITS: Inference output sounds very different from the original voice

Open finalbattle opened this issue 7 months ago • 0 comments

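Let me start with my inference result and the original audio. They differ a lot; the main problem is that the synthesized voice sounds hoarse.

Inference result: zhaozhongxiang_inference_audio.wav.txt
Original audio: zhaozhongxiang_yueyanglouji_no_bg.wav_0000616960_0000909760.wav.txt
Slicing method used: Slice by every punct

When running the one-click three-step formatting ("一键三连"), the following warning is reported. I don't know whether it has any effect?

Some weights of the model checkpoint at GPT_SoVITS/pretrained_models/chinese-hubert-base were not used when initializing HubertModel: ['encoder.pos_conv_embed.conv.weight_g', 'encoder.pos_conv_embed.conv.weight_v']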

This IS expected if you are initializing HubertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing HubertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of HubertModel were not initialized from the model checkpoint at GPT_SoVITS/pretrained_models/chinese-hubert-base and are newly initialized: ['encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'encoder.pos_conv_embed.conv.parametrizations.weight.original1']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/home/zhangpeng/.conda/envs/GPTSoVits/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
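
For reference, the two "not used" keys and the two "newly initialized" keys in the warning look like the same parameters under different names: PyTorch's parametrized `weight_norm` stores the magnitude (`weight_g`) as `parametrizations.weight.original0` and the direction (`weight_v`) as `parametrizations.weight.original1`. The sketch below only checks that name correspondence on the strings from the log. The helper `rename_weight_norm_key` is hypothetical (it is not part of GPT-SoVITS, transformers, or PyTorch), and the check says nothing about whether the weights were actually loaded in this environment or whether the warning is related to the hoarse output.

```python
# Hypothetical helper: maps legacy torch.nn.utils.weight_norm parameter names
# to the names used by torch.nn.utils.parametrizations.weight_norm.
LEGACY_TO_PARAMETRIZED = {
    "weight_g": "parametrizations.weight.original0",  # magnitude
    "weight_v": "parametrizations.weight.original1",  # direction
}


def rename_weight_norm_key(key: str) -> str:
    """Return the parametrized name for a legacy weight_norm key, else the key unchanged."""
    prefix, _, leaf = key.rpartition(".")
    new_leaf = LEGACY_TO_PARAMETRIZED.get(leaf)
    if new_leaf is None:
        return key
    return f"{prefix}.{new_leaf}" if prefix else new_leaf


# Key names copied from the warning above.
unused = [
    "encoder.pos_conv_embed.conv.weight_g",
    "encoder.pos_conv_embed.conv.weight_v",
]
newly_initialized = [
    "encoder.pos_conv_embed.conv.parametrizations.weight.original0",
    "encoder.pos_conv_embed.conv.parametrizations.weight.original1",
]

renamed = [rename_weight_norm_key(k) for k in unused]
print(renamed == newly_initialized)
```

If the comparison holds, the mismatch in the warning is at least consistent with a pure renaming between the checkpoint and the installed PyTorch version; confirming that the loaded values match the checkpoint would still require comparing the tensors themselves.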

Jul 27 '24 08:07 finalbattle
