GPT-SoVITS issues

How do I fix this error?

5

py:139: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result...

709411fan

In follow-up

In the synthesized audio, "换行" (huan hang) is read as "huang xing"

8

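The text I synthesized was "文本切分工具。太长的文本合成出来效果不一定好，所以太长建议先切。合成会根据文本的换行分开合成再拼起来。" ("Text splitting tool. Overly long text may not synthesize well, so it is recommended to split it first. Synthesis processes the text separately at each line break and then joins the results."). Characters are sometimes dropped from the output, for example the opening "文本切分工具。" goes missing; my guess is that word segmentation is the cause. In addition, "换行" in this sentence should be read "huan hang", but the synthesized audio reads it as "huang xing". Is this a problem with the TTS engine?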
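To check whether text is being lost at the splitting stage, one option is to split the input yourself and confirm that the segments add up to the original. The following is only a minimal sketch of splitting on line breaks and sentence-ending punctuation. `split_text` is a hypothetical helper written for illustration, not the splitter GPT-SoVITS actually uses, and it does not address the "换行" pronunciation problem.

```python
import re


def split_text(text: str) -> list[str]:
    """Split on line breaks, then after each sentence-ending mark.

    Hypothetical helper for illustration; not the project's own splitter.
    """
    segments = []
    for line in text.splitlines():
        # Zero-width split right after Chinese or Western sentence-ending
        # punctuation, so each mark stays attached to its sentence.
        for piece in re.split(r"(?<=[。！？!?])", line):
            piece = piece.strip()
            if piece:
                segments.append(piece)
    return segments


text = "文本切分工具。太长的文本合成出来效果不一定好，所以太长建议先切。合成会根据文本的换行分开合成再拼起来。"
segments = split_text(text)

# If nothing was dropped, joining the segments reproduces the input.
assert "".join(segments) == text
```

If the first segment is present here but missing from the audio, the loss is more likely happening during synthesis than during splitting.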

zjzkiss

todolist

Questions about the scale and quality of training data, and the possibility of releasing the training data in the future

5

This project is great. I would like to inquire about the scale of the data used to train the model, and the quality of the data (whether it's accurately labeled...

hertz-pj

Using high-quality speech from another speaker as the reference audio blends very well

2

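I originally just wanted to see what would happen if the model was trained on my own voice while the reference audio came from someone else. It turns out to behave like an addition: the timbre comes out as an average of the two, giving a new voice. My own model was trained without any English speech, so it cannot speak English well, but with an English speech by a professional voice talent as the reference it speaks very good English, and even mixed Chinese and English. The timbre is not my own, though; it is as if timbre = mine + reference, while the tone and intonation come from the reference. Great work. [Example](https://github.com/RVC-Boss/GPT-SoVITS/assets/17892787/d421cee9-5a0c-447f-a9a2-aa610676ec23)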

shadowwider

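Split the text into segments and generate each one separately instead of generating one complete sentence, or cut the generated audio and re-run inference on the poor segments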

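After a long sentence is split, a separate audio clip is generated for each resulting segment. Could unsatisfactory segments be re-rolled individually and everything then merged into one long audio file at the end? Alternatively, add an option to cut the long audio manually (similar to how audio is cut in subfix) and re-run inference on the poor segments.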
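The requested workflow could look roughly like the sketch below: synthesize each segment, re-roll only the one that sounds bad, then merge. Everything here is an assumption made for illustration. `fake_synthesize` is a placeholder that returns dummy samples rather than a real GPT-SoVITS call, and `synthesize_all`, `reroll`, and `merge` are hypothetical names.

```python
import random
from typing import Callable, List

Audio = List[float]  # stand-in for a waveform


def fake_synthesize(segment: str, seed: int) -> Audio:
    """Placeholder for a TTS call: deterministic dummy samples per (segment, seed)."""
    rng = random.Random(f"{segment}|{seed}")
    return [rng.uniform(-1.0, 1.0) for _ in range(len(segment) * 4)]


def synthesize_all(segments: List[str],
                   synthesize: Callable[[str, int], Audio],
                   seed: int = 0) -> List[Audio]:
    """Generate one clip per text segment."""
    return [synthesize(seg, seed) for seg in segments]


def reroll(clips: List[Audio], segments: List[str], index: int,
           synthesize: Callable[[str, int], Audio], new_seed: int) -> List[Audio]:
    """Regenerate only the clip at `index`; all other clips are kept as-is."""
    updated = list(clips)
    updated[index] = synthesize(segments[index], new_seed)
    return updated


def merge(clips: List[Audio], gap_samples: int = 0) -> Audio:
    """Concatenate clips, inserting `gap_samples` of silence between them."""
    merged: Audio = []
    for i, clip in enumerate(clips):
        if i > 0:
            merged.extend([0.0] * gap_samples)
        merged.extend(clip)
    return merged


segments = ["aa", "bbb"]
clips = synthesize_all(segments, fake_synthesize)
clips = reroll(clips, segments, index=1, synthesize=fake_synthesize, new_seed=1)
long_audio = merge(clips, gap_samples=2)
```

Keeping the per-segment clips around until the final merge is what makes a selective re-roll cheap: only the rejected segment is synthesized again.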

a-cold-bird

Could a batch generation (multi-roll) feature be added?

Generate several candidates in one go, so that fewer re-rolls are needed and the resulting audio clips are easier to compare.
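One way to picture this request is a helper that produces several takes of the same text, keyed by seed so that a preferred take can be reproduced later. This is a sketch only: `generate_candidates` and `dummy_synthesize` are hypothetical names, and the real inference call is assumed to accept some form of seed, which the source does not confirm.

```python
from typing import Callable, Dict, List

Audio = List[float]  # stand-in for a waveform


def generate_candidates(text: str,
                        synthesize: Callable[[str, int], Audio],
                        n: int,
                        base_seed: int = 0) -> Dict[int, Audio]:
    """Return n candidate clips for the same text, keyed by the seed used."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {seed: synthesize(text, seed)
            for seed in range(base_seed, base_seed + n)}


def dummy_synthesize(text: str, seed: int) -> Audio:
    """Placeholder for a TTS call; encodes the seed so takes are distinguishable."""
    return [float(seed)] * len(text)


candidates = generate_candidates("hello", dummy_synthesize, n=3, base_seed=10)
for seed, clip in candidates.items():
    print(f"seed={seed}: {len(clip)} samples")
```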

a-cold-bird

Failed to load audio files?

1

My file path is correct, double checked... but still getting below error, can someone pls help? (replica) PS D:\dev\replica\GPT-SoVITS> python webui.py Running on local URL: http://0.0.0.0:9874 "D:\conda_envs\replica\python.exe" tools/slice_audio.py "D:\dev\replica\audio\input" "output\slicer_opt"...

Jeru2023

Japanese fine-tuning

4

From what I understand, the model currently requires fine-tuning on at least 2-3 hours of speech data to produce convincing results in Japanese. Is this correct? Additionally, is it necessary...

Kamikadashi