Pretrained voices are not shown in the webui when using the CosyVoice2-0.5B model
Other models work fine, but with --model_dir pretrained_models/CosyVoice2-0.5B the pretrained voices are not shown.
What is causing this?
I have the same problem. Is there a fix?
Has this been solved? I'm seeing the same problem.
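No pretrained voice file is provided for this model, so it is not a display problem. You can wait for the sft version, or use the spk2info.pt from the 1.0 model in the meantime.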
Thanks, it works now.
https://github.com/FunAudioLLM/CosyVoice/issues/729#issuecomment-2545399338
- #729
It appears that you're encountering an issue with the absence of the spk2info.pt file in the pretrained_models\CosyVoice2-0.5B directory, which is causing the webui.py script to report that the sft_spk variable is an empty list.
To resolve this, you should unzip the provided spk2info.zip file to obtain the spk2info.pt file. After extracting it, place the spk2info.pt file within the pretrained_models/CosyVoice2-0.5B directory. This file is essential for the model, as it contains critical speaker information required for its proper operation.
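The workaround described above (placing a spk2info.pt file in the model directory) could be scripted roughly as follows. This is only a minimal sketch: the helper name `install_spk2info` is hypothetical and not part of the CosyVoice repository, and the source path is whatever location your extracted or v1 file actually lives in.

```python
import shutil
from pathlib import Path


def install_spk2info(src_file: str, model_dir: str) -> Path:
    """Copy a spk2info.pt file into a model directory unless one already exists.

    `install_spk2info` is a hypothetical helper for illustration only.
    """
    src = Path(src_file)
    dst = Path(model_dir) / "spk2info.pt"
    if not src.is_file():
        raise FileNotFoundError(f"source speaker file not found: {src}")
    if dst.exists():
        # Leave an existing file untouched rather than overwriting it.
        return dst
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return dst
```

A call might look like `install_spk2info("path/to/spk2info.pt", "pretrained_models/CosyVoice2-0.5B")`, where the first argument is a placeholder. This only makes the voice choices appear in the webui; it says nothing about whether the speaker data matches the model.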
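Thanks for the answer. Copying the v1 spk2info.pt also works for me, but there is a problem: the male pretrained voices all come out sounding female. Why is that? Also, does the 2-0.5B model not support natural language control? I've seen others get it working.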
It should be supported; the command line has this feature.
I think the webui.py code just hasn't been updated yet.
I tried it; the message says only the 300M-Instruct model supports it.
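When I use natural language control with the 2-0.5B model, it shows: "You are using natural language control mode; the pretrained_models/CosyVoice2-0.5B model does not support this mode. Please use the iic/CosyVoice-300M-Instruct model."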
How can I save a voice learned from prompt audio as a standalone model?
That is only a restriction imposed by the webui code.
Does anyone have a solution?
It works
Is the spk2info provided here computed the same way as in v1? Comparing the two spk2info.pt files, I found that the values differ.
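For anyone who wants to check how two speaker files differ, a comparison along these lines may help. It is a sketch under stated assumptions: `compare_speaker_tables` is a hypothetical helper, and it works on plain mappings of speaker name to a flat list of floats rather than on the .pt files directly.

```python
from typing import List, Mapping, Sequence, Tuple


def compare_speaker_tables(
    a: Mapping[str, Sequence[float]],
    b: Mapping[str, Sequence[float]],
    tol: float = 1e-6,
) -> Tuple[List[str], List[str], List[str]]:
    """Return (names only in a, names only in b, shared names whose vectors differ)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = []
    for name in sorted(set(a) & set(b)):
        va, vb = list(a[name]), list(b[name])
        # Vectors differ if their lengths differ or any element is off by more than tol.
        if len(va) != len(vb) or any(abs(x - y) > tol for x, y in zip(va, vb)):
            differing.append(name)
    return only_a, only_b, differing
```

To feed it real data you would first load each file, for example with `torch.load(path, map_location="cpu")`, and flatten each speaker's embedding to a list. The assumption that each entry holds its vector under an `'embedding'` key is not confirmed in this thread, so inspect the loaded object before relying on it.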
Does anyone have a webui.py that works with version 2.0? The bundled one has too many restrictions.
+1
import sys
import gradio as gr
# Matcha-TTS must be on the path before the cosyvoice imports below.
sys.path.append('third_party/Matcha-TTS')
from cosyvoice.cli.cosyvoice import CosyVoice2
from cosyvoice.utils.file_utils import load_wav
import torchaudio
import torch

cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B', load_jit=False, load_trt=False, fp16=False)


def generate_audio(audio_path, tts_text, instruct_text):
    # All three inputs are required.
    if not audio_path or not tts_text or not instruct_text:
        return None
    prompt_speech = load_wav(audio_path, 16000)
    # Generate audio; each yielded segment is saved to its own file.
    results = []
    for i, j in enumerate(cosyvoice.inference_instruct2(
        tts_text,
        instruct_text,
        prompt_speech,
        stream=False
    )):
        output_path = f"output_{i}.wav"
        torchaudio.save(output_path, j['tts_speech'], cosyvoice.sample_rate)
        results.append(output_path)
    if not results:
        return None
    # Concatenate all segments along the time axis.
    waveforms = []
    for path in results:
        waveform, sr = torchaudio.load(path)
        waveforms.append(waveform)
    concatenated = torch.cat(waveforms, dim=1)
    output_path = "output_combined.wav"
    torchaudio.save(output_path, concatenated, cosyvoice.sample_rate)
    return output_path


with gr.Blocks(title="CosyVoice TTS") as app:
    gr.Markdown("## CosyVoice Speech Synthesis")
    with gr.Row():
        with gr.Column():
            ref_audio = gr.Audio(label="Reference audio", type="filepath")
            tts_text = gr.Textbox(label="Text to synthesize", placeholder="Enter the text to synthesize...")
            instruct_text = gr.Textbox(label="Style instruction", placeholder="Enter a voice style instruction...")
            generate_btn = gr.Button("Generate speech", variant="primary")
        with gr.Column():
            audio_output = gr.Audio(label="Result", interactive=False)
    generate_btn.click(
        fn=generate_audio,
        inputs=[ref_audio, tts_text, instruct_text],
        outputs=audio_output
    )

if __name__ == "__main__":
    app.launch(server_name="0.0.0.0", server_port=7860, share=False)
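This can serve as a stopgap. The style instruction must end with <|endofprompt|>. However, I found that even with a male reference audio, the output still sounds somewhat female.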
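Since the instruction text has to end with <|endofprompt|>, a small helper could append the token when it is missing so users do not have to type it. This is an illustrative sketch; `with_end_of_prompt` is a hypothetical name, not a CosyVoice function.

```python
END_OF_PROMPT = "<|endofprompt|>"


def with_end_of_prompt(instruct_text: str) -> str:
    """Return the instruction text with the end-of-prompt token appended once."""
    text = instruct_text.strip()
    if text.endswith(END_OF_PROMPT):
        # Token is already present; do not add it a second time.
        return text
    return text + END_OF_PROMPT
```

In a script like the one above, the call would wrap `instruct_text` before it is passed to `inference_instruct2`.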
Not working. The choices appear, but the generated voice is bad.
I am not surprised that it generated a bad voice.
As I understand it, CosyVoice 2 no longer uses the speaker embedding v in training the way CosyVoice 1 did. Calling inference_sft() this way loads a CV2 backbone model together with a MISMATCHED CV1 spk2info.pt. It can run and produce sound, but its behavior is unpredictable.
