
Added a custom polyphone pronunciation feature with AI help

Open NewEpoch2020 opened this issue 6 months ago • 15 comments

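With help from an AI, I added a feature for specifying custom phonemes for polyphonic characters. It is offered for reference only.

First, add a file named "自定义多音字.json" to the repository root. It maps each word to the phonemes you want it pronounced with, for example: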

{
    "大夫": ["d", "ai4", "f", "u5"],
    "换行": ["h", "uan4", "h", "ang2"]
}
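
Before wiring the file into inference, it may help to sanity-check it. The sketch below is not part of the original change: `validate_polyphone_dict` is a hypothetical helper, and it assumes each Chinese character corresponds to exactly two phonemes (initial plus final), which matches the two entries above but is not confirmed for every case.

```python
import json

def validate_polyphone_dict(entries):
    """Return a list of problems found; an empty list means the mapping looks usable."""
    problems = []
    for word, phones in entries.items():
        # Each value must be a list of phoneme strings
        if not isinstance(phones, list) or not all(isinstance(p, str) for p in phones):
            problems.append(f"{word}: value must be a list of strings")
            continue
        # Assumption: two phonemes (initial + final) per character
        if len(phones) != 2 * len(word):
            problems.append(f"{word}: expected {2 * len(word)} phonemes, got {len(phones)}")
    return problems

# The second entry is deliberately one phoneme short
sample = json.loads('{"大夫": ["d", "ai4", "f", "u5"], "换行": ["h", "uan4", "h"]}')
print(validate_polyphone_dict(sample))  # ['换行: expected 4 phonemes, got 3']
```

An entry with the wrong number of phonemes would shift every later phoneme out of alignment with the per-character counts, so catching it at load time is cheaper than debugging odd audio.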

Then modify GPT_SoVITS/inference_webui.py:

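# Load the custom polyphone dictionary from JSON
import json
import os

def load_duoyin_dict():
    # Directory containing this file (GPT_SoVITS/)
    current_dir = os.path.dirname(os.path.abspath(__file__))
    # Repository root, one level up
    parent_dir = os.path.dirname(current_dir)
    # Path to 自定义多音字.json in the repository root
    json_path = os.path.join(parent_dir, '自定义多音字.json')

    # A missing file means no custom pronunciations, not a crash at startup
    if not os.path.exists(json_path):
        return {}

    with open(json_path, 'r', encoding='utf-8') as f:
        return json.load(f)

duoyin_dict = load_duoyin_dict()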

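# Find custom polyphonic words and replace their phonemes
def find_custom_tone(phones, word2ph, norm_text):
    for word, custom_phones in duoyin_dict.items():
        # Visit every occurrence of the word, not only the first one
        start_index = norm_text.find(word)
        while start_index != -1:
            end_index = start_index + len(word)

            # Map the character span to the matching span in the phoneme list
            start_phone_index = sum(word2ph[:start_index])
            end_phone_index = start_phone_index + sum(word2ph[start_index:end_index])

            # Replace only when the lengths match, so word2ph stays consistent with phones
            if end_phone_index - start_phone_index == len(custom_phones):
                phones[start_phone_index:end_phone_index] = custom_phones

            start_index = norm_text.find(word, end_index)

    return phones
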
# Modified version of the original clean_text_inf function
def clean_text_inf(text: str, language, version):
    # Run clean_text to get the initial phoneme list, word2ph, and normalized text
    phones, word2ph, norm_text = clean_text(text, language, version)

    # Apply the custom polyphone pronunciations
    phones = find_custom_tone(phones, word2ph, norm_text)

    # Convert the phoneme list to a sequence of IDs
    phones = cleaned_text_to_sequence(phones, version)

    return phones, word2ph, norm_text
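
To make the index arithmetic concrete, here is a small standalone sketch that can be run without the rest of the project. `replace_span` is a hypothetical name, and the `norm_text`, `word2ph`, and `phones` values are hand-written toy data, not actual output from `clean_text`. It assumes `word2ph[i]` holds the number of phonemes produced by character `i` of `norm_text`.

```python
def replace_span(phones, word2ph, norm_text, word, custom_phones):
    """Return a copy of phones with every occurrence of word re-pronounced."""
    phones = list(phones)  # work on a copy so the caller's list is untouched
    start = norm_text.find(word)
    while start != -1:
        end = start + len(word)
        # Phonemes before the word = sum of per-character counts up to start
        p_start = sum(word2ph[:start])
        p_end = p_start + sum(word2ph[start:end])
        # Skip replacements that would change the phoneme count
        if p_end - p_start == len(custom_phones):
            phones[p_start:p_end] = custom_phones
        start = norm_text.find(word, end)
    return phones

# Toy data: four characters, two phonemes each (illustrative values only)
norm_text = "请大夫来"
word2ph = [2, 2, 2, 2]
phones = ["q", "ing3", "d", "a4", "f", "u1", "l", "ai2"]

result = replace_span(phones, word2ph, norm_text, "大夫", ["d", "ai4", "f", "u5"])
print(result)  # ['q', 'ing3', 'd', 'ai4', 'f', 'u5', 'l', 'ai2']
```

Here "大夫" starts at character 1, so the phoneme span starts at `sum([2]) = 2` and ends at `2 + sum([2, 2]) = 6`; only `phones[2:6]` is rewritten.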

NewEpoch2020 · Aug 17 '24 09:08