fish-speech

Brand new TTS solution

Results 162 fish-speech issues

![QQ image 20240516222344](https://github.com/fishaudio/fish-speech/assets/31399799/783021b0-f527-4628-a6a3-517c770175e2) ![QQ image 20240516222350](https://github.com/fishaudio/fish-speech/assets/31399799/de00cead-6ec0-4d1f-98d5-e3d3f97294c7) It keeps throwing errors and cannot be used.

bug

**Describe the bug**

1. In the web UI test, the text "123456789ABCDEFG" is not read correctly, whether a reference audio or a random speaker is used.
2. In the web UI test, the end of the generated audio is cut off: the last character is not fully spoken and often stops halfway.

**To Reproduce**

```bash
python tools/webui.py \
    --llama-checkpoint-path "checkpoints/text2semantic-sft-medium-v1.1-4k.pth" \
    --llama-config-name dual_ar_2_codebook_medium \
    --decoder-config-name vits_decoder_finetune \
    --decoder-checkpoint-path "checkpoints/vits_decoder_v1.1.ckpt"
```

1. In the web UI, enter the text "123456789ABCDEFG", then play or listen to the generated speech output.
2. In the web UI, enter the text “由 Fish Audio...

bug

![image](https://github.com/fishaudio/fish-speech/assets/42288790/a6a6d92f-15e0-4097-acec-eaab791e26b8) https://huggingface.co/spaces/fishaudio/fish-speech-1 (webui) This is the audio produced by inference. The problem reproduces with high probability (this speaker was chosen to raise the reproduction rate). [wav](https://fishaudio-fish-speech-1.hf.space/file=/tmp/gradio/aab5417c350356cafcc24fedf58edcba7be7383a/audio.wav)

bug

Running this command raises an error: `python tools/llama/generate.py --text "床前明月光,疑似地上霜。举头望明月,低头思故乡" --prompt-text "1234567" --prompt-tokens "fake.npy" --config-name dual_ar_2_codebook_medium --checkpoint-path "checkpoints/text2semantic-sft-medium-v1.1-4k.pth" --num-samples 2 --compile`

bug

When running inference through the API, the reference audio is a female voice, but the generated audio is a male voice.

bug

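After switching to the LLAMA 1b model, pronunciation and intonation improved noticeably compared with the earlier small model (the AR in GPT SOVITS). Assuming no hardware constraints, would an even larger model (such as 7b/13b) noticeably improve synthesis quality?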

enhancement

Hi, thank you for the great work. However, I got poor quality when synthesizing Japanese data. My data, which has about 12 hours of audio and 16 speakers, was extracted from 3 visual...

As a complete beginner, I find the API hard to follow. Could anyone provide some documentation and usage examples?

enhancement

![Screenshot 2024-05-17 163400](https://github.com/fishaudio/fish-speech/assets/170086190/3b5eb81a-5325-4c09-b65d-7abc821a46a2)

bug

What is the sft model? text2semantic_sft.yaml contains `ckpt_path: checkpoints/text2semantic-medium-v1-2k.pth` and `resume_weights_only: true`. What is the difference between the models with and without sft?
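
For context on this question: SFT commonly stands for supervised fine-tuning, and a setting named `resume_weights_only: true` suggests that training starts from the pretrained weights at `ckpt_path` while the optimizer state and step counter start fresh. The following is a minimal sketch under that assumption. `TrainerState`, `restore`, and the checkpoint layout are hypothetical names made up for illustration and are not taken from the fish-speech code.

```python
from dataclasses import dataclass, field


@dataclass
class TrainerState:
    """Hypothetical stand-in for the state of a training loop."""
    weights: dict = field(default_factory=dict)
    optimizer_state: dict = field(default_factory=dict)
    global_step: int = 0


def restore(state: TrainerState, checkpoint: dict, resume_weights_only: bool) -> TrainerState:
    """Restore from a checkpoint dict (layout assumed for illustration)."""
    # Model weights are loaded in both modes.
    state.weights = dict(checkpoint["state_dict"])
    if not resume_weights_only:
        # Full resume: also continue the optimizer and the step counter.
        state.optimizer_state = dict(checkpoint.get("optimizer_state", {}))
        state.global_step = checkpoint.get("global_step", 0)
    return state


# Hypothetical checkpoint contents.
checkpoint = {
    "state_dict": {"layer.weight": [0.1, 0.2]},
    "optimizer_state": {"lr": 1e-4},
    "global_step": 2000,
}

# Fine-tuning style: start from pretrained weights, fresh training state.
finetune = restore(TrainerState(), checkpoint, resume_weights_only=True)
# Full resume: continue an interrupted run where it stopped.
resumed = restore(TrainerState(), checkpoint, resume_weights_only=False)
print(finetune.global_step, resumed.global_step)
```

In this sketch both modes end up with the same weights; only the weights-only mode discards the optimizer state and restarts the step counter, which is the usual starting point for a fine-tuning run on new data.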