Lai
Traceback (most recent call last):
  File "kantts/bin/train_sambert.py", line 224, in train
    trainer.train()
  File "/home/l/project/KAN-TTS_/kantts/train/trainer.py", line 210, in train
    self.train_epoch()
  File "/home/l/project/KAN-TTS_/kantts/train/trainer.py", line 223, in train_epoch
    self.check_eval_interval()
  File "/home/l/project/KAN-TTS_/kantts/train/trainer.py", line 200,...
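**Describe the bug**

1. In the web UI test, the text "123456789ABCDEFG" is never read correctly, whether a reference audio clip or a random speaker is used.
2. In the web UI test, the end of the generated audio is cut off: the last character is not spoken in full and usually stops about halfway through.

**To Reproduce**

```bash
python tools/webui.py \
    --llama-checkpoint-path "checkpoints/text2semantic-sft-medium-v1.1-4k.pth" \
    --llama-config-name dual_ar_2_codebook_medium \
    --decoder-config-name vits_decoder_finetune \
    --decoder-checkpoint-path "checkpoints/vits_decoder_v1.1.ckpt"
```

1. In the web UI, enter the text "123456789ABCDEFG", then play or listen to the generated speech output.
2. In the web UI, enter the text “由 Fish Audio...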
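"I noticed some flaws in the background segmentation and blinking features. Would it be possible to pass in a video and run inference on the mouth region only? Thank you for your help!"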
With repeated use of CosyVoice2, the output has severe electronic-sounding artifacts, even though the model has been updated to the latest version.

```python
cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B', load_jit=False, load_trt=False, fp16=False)

# NOTE if you want to reproduce the results on https://funaudiollm.github.io/cosyvoice2, please add text_frontend=False during inference
# zero_shot usage
prompt_speech_16k = load_wav('test_clone.wav',...
```